(54) Title: PROSTATE-SPECIFIC POLYNUCLEOTIDES, POLYPEPTIDES AND THEIR METHODS OF USE
(57) Abstract

The invention provides isolated polynucleotides encoding prostate-specific, androgen-regulated polypeptides. The invention
also provides substantially pure polynucleotides corresponding to genomic regulatory regions of prostate-specific, androgen-regulated
polynucleotides. Fragments and probes of these polynucleotides are also provided. The invention further provides a method of diagnosing
or predicting the susceptibility of a prostate neoplastic condition in an individual suspected of having a neoplastic condition of the prostate.
The method consists of: (a) obtaining a fluid or prostate sample of the individual; (b) determining the expression level of the prostate-specific,
androgen-regulated polynucleotide or polypeptide; and (c) comparing the expression levels of the prostate-specific, androgen-regulated
polynucleotide or polypeptide to expression levels from a normal fluid sample, from normal prostate cells or from an androgen-dependent
cell line, wherein a two-fold change in expression level of the prostate-specific, androgen-regulated polynucleotide or polypeptide in the
individual fluid or prostate sample as compared to the normal fluid or normal prostate cells or an androgen-dependent cell line indicates
the presence of a prostate neoplastic condition. Methods of identifying compounds that selectively inhibit or increase prostate-specific
polypeptides of the invention and a method of treating or reducing the progression of a prostate neoplastic condition are also provided.
PROSTATE-SPECIFIC POLYNUCLEOTIDES, POLYPEPTIDES AND
THEIR METHODS OF USE 

This invention was made with government support under grant number K08 
CA75173-01A1 awarded by the National Institutes of Health. The United States 
Government has certain rights in this invention.

Related Applications 
This application claims the benefit of U.S. provisional patent application Serial 
No. 60/130,778, filed on April 23, 1999, Serial No. 60/151,585, filed on August 30, 
1999, Serial No. 60/174,003, filed on December 30, 1999, and Serial No. 60/177,751, 
filed on January 24, 2000.

Field of the Invention 
This invention relates generally to prostate cancer and, more specifically, to
androgen-regulated, prostate-specific nucleic acid molecules, proteins and antibodies
that can be used to diagnose and treat prostate cancer.

Background of the Invention

Cancer is currently the second leading cause of mortality in the United States. 
However, it is estimated that by the year 2000 cancer will surpass heart disease and 
become the leading cause of death in the United States. Prostate cancer is the most 
common non-cutaneous cancer in the United States and the second leading cause of male 
cancer mortality.

Cancerous tumors result when a cell escapes from its normal growth regulatory 
mechanisms and proliferates in an uncontrolled fashion. As a result of such uncontrolled 
proliferation, cancerous tumors usually invade neighboring tissues and spread by lymph
or blood stream to create secondary or metastatic growths in other tissues. If untreated,
cancerous tumors follow a fatal course. Prostate cancer, due to its slow growth profile, is
an excellent candidate for early detection and therapeutic intervention. 

During the last decade, most advances in prostate cancer research have focused
on prostate specific antigen (PSA), a member of the serine protease family that exhibits a
prostate-specific expression profile. Serum PSA remains the most widely used tumor
marker for monitoring prostate cancer, but its specificity is limited by a high frequency of
falsely elevated values in men with benign prostatic hyperplasia (BPH). Other
biomarkers of prostate cancer progression have proven to be of limited clinical use in
recent surveys because they are not uniformly elevated in men with advanced prostate
cancer. Due to the limitations of currently available biomarkers, the identification and
characterization of prostate specific genes is essential to the development of more
accurate diagnostic methods and therapeutic targets. The clinical potential of novel
tumor markers can be optimized by either utilizing them in combination with other tumor
markers or by themselves in the development of diagnostic and treatment modalities.

Androgens are a class of C19 steroids that are essential for the development,
growth, and maintenance of the prostate. Androgens exert their effects on the prostate
target cells via the intracellular androgen receptor (AR). The AR facilitates androgen-induced
regulation of genes involved in cellular proliferation and differentiation. As is
the case with normal prostate development, primary prostatic cancers are largely
dependent on androgens for growth and survival. Imbalance in androgen synthesis and
degradation in prostate cells can lead to excess androgen, causing excessive cell growth
as seen in benign prostatic hyperplasia (BPH) and prostate cancer. Prostate-specific genes
that contain androgen response elements (AREs) necessary for androgen induction
include PSA, which contains two AREs, and human prostate-specific kallikrein
(hKLK2). Despite clinical evidence that control of proper intracellular androgen levels
in prostate cells is critical to a healthy prostate, the molecular components underlying the
development and progression of prostate cancer are poorly understood. Identification of
the components controlling androgen-regulation of the prostate is important for the
development of new treatment modalities to cure prostate neoplastic conditions.

Thus, there exists a need for identification of additional genes involved in
androgen-regulation of the prostate. In addition, there exists a need for identification of
additional prostate specific genes that can be used as diagnostic markers and therapeutic
targets for prostate cancer. The present invention satisfies this need and provides related
advantages as well.

Summary of the Invention 
In accordance with the foregoing, cDNA molecules that are predominantly
expressed in the prostate gland have been isolated and sequenced, and the corresponding
amino acid sequences have been deduced. Accordingly, the present invention relates to
isolated, recombinant polypeptides that are expressed in the prostate gland, and to
isolated polynucleotide sequences which are predominantly expressed in the prostate
gland, such as the sequences designated: SEQ ID NO:1, which encodes ARSDR1, a short
chain dehydrogenase/reductase having the amino acid sequence SEQ ID NO:2;
polynucleotide SEQ ID NO:3, which encodes TMPRSS2, a serine protease having the
amino acid sequence SEQ ID NO:4; polynucleotide SEQ ID NO:5, which encodes
PART-1, a polypeptide of unknown function having the amino acid sequence SEQ ID
NO:6; and polynucleotide SEQ ID NO:7, which encodes SC3, a polypeptide of unknown
function.

In one aspect, the present invention provides an isolated polynucleotide capable
of hybridizing under stringent conditions to at least 15 contiguous nucleotides of a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10 and
SEQ ID NO:11.

In another aspect, the present invention provides a substantially pure
polynucleotide probe comprising at least 15 contiguous nucleotides of a polynucleotide
sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10 and SEQ ID NO:11,
or a fragment thereof.

In yet another aspect, the present invention provides a substantially pure
polypeptide comprising substantially an amino acid sequence selected from the group
consisting of the sequences shown as SEQ ID NO:2, SEQ ID NO:6, and functional
fragments thereof.

Another embodiment of the invention provides an antibody that specifically binds
to a polypeptide having an amino acid sequence selected from the group consisting of
SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, or a fragment thereof.

The invention further provides a method of diagnosing or predicting the
susceptibility of a prostate neoplastic condition in an individual suspected of having a
neoplastic condition of the prostate. The method is performed by:

(a) obtaining a fluid sample from an individual;

(b) determining an expression level of at least one polypeptide selected from
the group consisting of ARSDR1, TMPRSS2, and PART-1; and

(c) comparing said determined expression level of said chosen polypeptide to
a normal expression level of said chosen polypeptide from a normal fluid sample,
wherein said measured expression level for said chosen polypeptide of 2-fold or more
from said fluid sample from said individual compared to said normal expression level
indicates the presence of a prostate neoplastic condition.

Alternatively, the method can
be performed by obtaining a prostate cell sample from the individual, determining an
expression level of one of the inventive polypeptides in the prostate cell sample, and
comparing the prostate expression level to a normal expression level of the
corresponding inventive polypeptide from normal prostate cells or from an androgen-dependent
cell line. Again, a 2-fold or more increase in expression of the inventive
polypeptide from the prostate cell sample from the individual compared to the normal
expression level indicates the presence of a prostate neoplastic condition.
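
The comparison in step (c) can be sketched in a few lines of Python. This is only an illustration of the 2-fold criterion stated above, not a validated diagnostic procedure; the function names, the threshold parameter and the example values are hypothetical, and the expression levels are assumed to be positive measurements in the same arbitrary units.

```python
def fold_change(sample_level: float, normal_level: float) -> float:
    """Ratio of the individual's expression level to the normal reference level."""
    if normal_level <= 0:
        raise ValueError("normal_level must be positive")
    return sample_level / normal_level


def indicates_neoplastic_condition(sample_level: float,
                                   normal_level: float,
                                   threshold: float = 2.0) -> bool:
    """True when the sample level is at least `threshold`-fold the normal level.

    The default of 2.0 mirrors the "2-fold or more" criterion in the text;
    the function and parameter names are illustrative assumptions.
    """
    return fold_change(sample_level, normal_level) >= threshold


if __name__ == "__main__":
    # Hypothetical expression levels in arbitrary units.
    print(indicates_neoplastic_condition(10.0, 4.0))  # 2.5-fold
    print(indicates_neoplastic_condition(6.0, 4.0))   # 1.5-fold
```

The same comparison applies whether the reference level comes from a normal fluid sample, normal prostate cells or an androgen-dependent cell line.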

Methods of identifying compounds that inhibit or increase the activity of the
inventive polypeptides and a method of treating or reducing the progression of a prostate
neoplastic condition are also provided.

Detailed Description of the Preferred Embodiment 
This invention is directed to prostate localized polypeptides and encoding
polynucleotide molecules. Promoter and regulatory regions of the prostate expressed
transcripts are also included. More specifically, four different androgen-responsive
polynucleotides and polypeptides are provided: a polynucleotide having the nucleotide
sequence shown in SEQ ID NO:1 that encodes ARSDR1, a short-chain
dehydrogenase/reductase 1 having the polypeptide sequence of SEQ ID NO:2; a
polynucleotide having the nucleotide sequence shown in SEQ ID NO:3 that encodes
TMPRSS2, a prostate-specific serine protease having the amino acid sequence presented
in SEQ ID NO:4; and two polynucleotides having the nucleic acid sequences represented
in SEQ ID NOS:5 and 7, respectively, that encode polypeptides of unknown function.
Polynucleotide SEQ ID NO:5 encodes a polypeptide having the amino acid sequence
shown in SEQ ID NO:6. The polypeptides encoded by the androgen-responsive
polynucleotides of the present invention are useful as both diagnostic markers for
neoplastic conditions of the prostate and as targets for therapy. Polynucleotides
corresponding to the expressed transcripts or promoters and regulatory regions are
similarly applicable in both diagnostic and therapeutic procedures.

In one embodiment, the invention is directed to polynucleotide transcripts of an
androgen regulated polynucleotide encoded by one of the nucleotide sequences shown in
SEQ ID NOS:1, 3, 5, and 7. The invention also pertains to 5' promoter and regulatory
regions shown in SEQ ID NO:8 (nucleotides 1 to 3,113), SEQ ID NO:9, SEQ ID NO:11
and a 3' untranslated region (UTR) of TMPRSS2 (SEQ ID NO:10). The inventive
polynucleotides, fragments of the polynucleotides and short oligonucleotides
corresponding to unique sequences are useful in a variety of diagnostic procedures which
employ probe hybridization methods. One advantage of employing nucleic acid
hybridization in diagnostic procedures is that very little sample can be used because the
analyte nucleic acid can be amplified to many copies by, for example, polymerase chain
reaction (PCR) or other well known methods in the art for polynucleotide amplification
and synthesis.

In another embodiment, the invention is directed to substantially pure
polypeptides and functional fragments thereof that are encoded by the polynucleotides of
the invention. In particular, the inventive polypeptides can be used to prepare antibodies.
ARSDR1, TMPRSS2, and PART-1 specific antibodies can be used, following a variety
of methods that are well known in the art, to diagnose prostate cancer.

In another embodiment, the invention is directed to methods for diagnosing
prostate neoplastic conditions. The short-chain dehydrogenase/reductase of the invention
is primarily expressed in prostate cells and becomes elevated in response to androgens.
As such, the polynucleotide sequences of the present invention are applicable, alone or in
combination with other molecules, as a specific marker for prostate cells and prostate
neoplastic conditions.

As used herein, the term "nucleotide" means a monomeric unit of DNA or RNA
containing a sugar moiety (pentose), a phosphate and a nitrogenous heterocyclic base.
The base is linked to the sugar moiety via the glycosidic carbon (1' carbon of pentose)
and that combination of base and sugar is called a nucleoside. The base characterizes the
nucleotide with the four bases of DNA being adenine ("A"), guanine ("G"), cytosine
("C") and thymine ("T"). Inosine ("I") is a synthetic base that can be used to substitute
for any of the four naturally-occurring bases (A, C, G or T). The four RNA bases are
A, G, C and uracil ("U"). The nucleotide sequences described herein comprise a linear
array of nucleotides connected by phosphodiester bonds between the 3' and 5' carbons of
adjacent pentoses.

"Oligonucleotide" refers to short length single or double stranded sequences of
deoxyribonucleotides linked via phosphodiester bonds. The oligonucleotides are
chemically synthesized by known methods and purified, for example, on polyacrylamide
gels.

The term "hybridize under stringent conditions", and grammatical equivalents
thereof, means that a polynucleotide molecule that has hybridized to a target
polynucleotide molecule immobilized on a DNA or RNA blot (such as a Southern blot or
Northern blot) remains hybridized to the immobilized target molecule on the blot during
washing of the blot under stringent conditions. In this context, exemplary hybridization
conditions are: hybridization at 65°C in 5.0 X SSC, 1% sodium dodecyl sulfate, for
16 hours (lower stringency hybridizations preferably utilize 6.0 X SSC, 1% sodium
dodecyl sulfate, at 20°C to 30°C for 16 hours). Exemplary very high stringency
conditions for washing DNA or RNA blots are: two washes of fifteen minutes each at
20°C to 30°C in 2.0 X SSC, followed by two washes of twenty minutes each at 65°C in
0.5 X SSC. Exemplary high stringency conditions for washing DNA or RNA blots are:
two washes of twenty minutes each at 20°C to 30°C in 2.0 X SSC, followed by one wash
of thirty minutes at 55°C in 1.0 X SSC. Exemplary moderate stringency conditions for
washing DNA or RNA blots are: two washes of twenty minutes each at 20°C to 30°C in
3.0 X SSC. Preferably, moderate stringency wash conditions are utilized after
hybridization in lower stringency hybridization conditions, i.e., 6.0 X SSC, 1% sodium
dodecyl sulfate, at 20°C to 30°C for 16 hours.

WO 00/65067    PCT/US00/10920

As used herein, the term "polynucleotide" refers to a deoxyribonucleic acid
(DNA) or ribonucleic acid (RNA) molecule that can optionally include one or more
non-native nucleotides, having, for example, one or more modifications to the base,
sugar, or phosphate portion, or can include a modified phosphodiester linkage. The term
polynucleotide includes both single-stranded and double-stranded polynucleotide
molecules, which can represent the sense strand, anti-sense strand, or both, and includes
linear, circular and branched conformations. Exemplary polynucleotides include
genomic DNA, cDNA, mRNA and oligonucleotides, corresponding to either the coding
or non-coding portion of the molecule. A polynucleotide of the invention can
additionally contain, if desired, a detectable moiety such as a radiolabel, fluorochrome,
ferromagnetic substance, luminescent tag or a detectable agent such as biotin.

As used herein, the term "isolated" in regard to a polynucleotide of the invention,
is intended to mean a polynucleotide whose structure is not identical to that of any
naturally occurring polynucleotide or to that of any fragment of a naturally occurring
genomic polynucleotide spanning more than three separate genes. The term therefore
includes, for example, (a) a DNA which has the sequence of part of a naturally occurring
genomic DNA molecule but is not flanked by both of the coding sequences that flank that
part of the molecule in the genome of the organism in which it naturally occurs; (b) a
polynucleotide incorporated into a vector or into the genomic DNA of a prokaryote or
eukaryote in a manner such that the resulting molecule is not identical to any naturally
occurring extrachromosomally replicating DNA or genomic DNA; (c) a separate
molecule such as a cDNA, a genomic fragment, a fragment produced by polymerase
chain reaction (PCR), or a restriction endonuclease polynucleotide fragment; and (d) a
recombinant nucleotide sequence that is part of a hybrid gene, i.e., a gene encoding a
fusion protein. Specifically excluded from this definition are polynucleotides present in
mixtures of (i) DNA molecules, (ii) transfected cells, and (iii) cell clones, e.g., as these
occur in a DNA library such as a cDNA or genomic library.

As used herein, the term "isolated" in regard to a polypeptide of the invention, is
intended to mean a molecule that is substantially free from cellular components or other
contaminants that are associated with the molecule as it is found in nature. "Substantially
pure" or "substantially free" means, in one illustrative aspect of the invention, purified to
a purity level of about 85%. In other aspects, these terms denote a purity of at least 90%.
In yet other aspects, these terms refer to a purity level of at least 95%. A substantially
pure polynucleotide or polypeptide will generally resolve as a band by gel
electrophoresis, and generate a nucleotide or amino acid sequence profile consistent with
a predominant species.

As used herein, the terms "amino acid" and "amino acids" refer to all naturally
occurring L-α-amino acids or their residues. The amino acids are identified by either the
single-letter or three-letter designations:



Asp   D   aspartic acid      Ile   I   isoleucine
Thr   T   threonine          Leu   L   leucine
Ser   S   serine             Tyr   Y   tyrosine
Glu   E   glutamic acid      Phe   F   phenylalanine
Pro   P   proline            His   H   histidine
Gly   G   glycine            Lys   K   lysine
Ala   A   alanine            Arg   R   arginine
Cys   C   cysteine           Trp   W   tryptophan
Val   V   valine             Gln   Q   glutamine
Met   M   methionine         Asn   N   asparagine
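
For readers who handle sequences programmatically, the designations listed above can be held in a simple lookup table. The following Python sketch is illustrative only; the dictionary and function names are assumptions introduced here, and only the twenty codes from the list are covered.

```python
# Three-letter to single-letter designations, exactly as listed in the text.
THREE_TO_ONE = {
    "Asp": "D", "Ile": "I", "Thr": "T", "Leu": "L", "Ser": "S",
    "Tyr": "Y", "Glu": "E", "Phe": "F", "Pro": "P", "His": "H",
    "Gly": "G", "Lys": "K", "Ala": "A", "Arg": "R", "Cys": "C",
    "Trp": "W", "Val": "V", "Gln": "Q", "Met": "M", "Asn": "N",
}

# Reverse lookup, built from the table above.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(residues):
    """Convert a list of three-letter designations to a single-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


def to_three_letter(sequence):
    """Convert a single-letter string to a list of three-letter designations."""
    return [ONE_TO_THREE[c] for c in sequence.upper()]


if __name__ == "__main__":
    print(to_one_letter(["Met", "Ala", "Ser"]))
    print(to_three_letter("MAS"))
```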



As used herein, the term "ARSDR1" refers to a polypeptide termed androgen
regulated short-chain dehydrogenase/reductase 1, which has substantially the same amino
acid sequence as shown in SEQ ID NO:2. ARSDR1 is a member of the short-chain
dehydrogenase/reductase superfamily and is predominantly expressed in normal and
neoplastic prostate epithelium. The ARSDR1 polypeptide is encoded by an
approximately 2.5 kb message having the nucleic acid sequence represented in SEQ ID
NO:1. The ARSDR1 promoter and regulatory region is approximately 3.1 kb in size and
has the sequence shown as nucleotides 1 to 3,113 of SEQ ID NO:8 (genomic nucleotide
sequence of ARSDR1). The ARSDR1 promoter contains an androgen response element
(ARE) at nucleotides 2,246 to 2,259 of SEQ ID NO:8 (Roche et al., Mol.
Endocrinol. 6:2229-2235 (1992)) as well as two progesterone responsive elements
(PREs) at positions 2,175 to 2,189 and 2,627 to 2,641 in SEQ ID NO:8 (Lieberman et
al., Mol. Endocrinol. 7:515-527 (1993)).
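
The element positions given for SEQ ID NO:8 are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing used by most programming languages. The Python sketch below shows one way to pull such regions out of a sequence string. The coordinates are those stated in this description (ARE at 2,246 to 2,259; PREs at 2,175 to 2,189 and 2,627 to 2,641); the dictionary keys, the function name and the placeholder sequence are assumptions, and the placeholder is not SEQ ID NO:8.

```python
# Coordinates (1-based, inclusive) as stated in the text for SEQ ID NO:8.
ARSDR1_ELEMENTS = {
    "promoter_and_regulatory_region": (1, 3113),
    "ARE": (2246, 2259),
    "PRE_1": (2175, 2189),
    "PRE_2": (2627, 2641),
}


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the subsequence from `start` to `end`, 1-based and inclusive."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Placeholder sequence only; the real SEQ ID NO:8 must be supplied by the user.
    placeholder = "ACGT" * 800
    for name, (start, end) in ARSDR1_ELEMENTS.items():
        region = extract_region(placeholder, start, end)
        print(name, start, end, len(region))
```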

As used herein, the term "TMPRSS2" is intended to refer to a polypeptide having
substantially the same amino acid sequence as presented in SEQ ID NO:4. The
TMPRSS2 polypeptide sequence was also previously described by Paoloni-Giacobino et
al., Genomics 44:309-329 (1997). Briefly, TMPRSS2 is an androgen-regulated serine
protease expressed in normal and neoplastic prostate epithelium. The TMPRSS2
polypeptide is encoded by an approximately 3.8 kb message having the nucleic acid
sequence shown in SEQ ID NO:3. The TMPRSS2 promoter and regulatory region is
approximately 0.9 kb in size and has the nucleotide sequence presented in SEQ ID NO:9.
The TMPRSS2 promoter region contains an androgen response element (ARE) at
nucleotides 576 to 590 of SEQ ID NO:9.

As used herein, the term "PART-1" refers to a polypeptide termed prostate
androgen-regulated transcript, which has substantially the same amino acid sequence as
shown in SEQ ID NO:6. PART-1 is encoded by an androgen-regulated cDNA whose
nucleotide sequence is represented in SEQ ID NO:5. The PART-1 polypeptide is
encoded by an approximately 2.1 kb message. The promoter and regulatory region of the
polynucleotide encoding PART-1 is contained in a region of about 2 kb having
the sequence shown in SEQ ID NO:11. The PART-1 promoter region contains a putative
binding site for the homeo-domain containing protein Pbx-1a (Van Dijk et al., Proc. Natl.
Acad. Sci. 90:6061-6065 (1993)) at nucleotides 536 to 544 of SEQ ID NO:11.

As used herein, the term "fragment" as used in reference to a substantially pure
polynucleotide of the present invention is intended to refer to a portion of the
polynucleotide molecule having the ability to selectively hybridize with the parent
polynucleotide molecule. The term "selectively hybridize" refers to an ability to bind the
parent polynucleotide molecule without substantial cross-reactivity with a molecule that
is not the parent polynucleotide molecule. Therefore, the term includes specific
hybridization where there is little or no detectable cross-reactivity with other
polynucleotide molecules. The term also includes minor cross-reactivity with other
molecules provided hybridization to the subject polynucleotide molecule is
distinguishable from hybridization to the cross-reactive species. Thus, a fragment of a
polynucleotide of the invention can be used, for example, as a PCR primer to selectively
amplify a nucleic acid molecule of the invention; as a selective primer for 5' or 3' RACE
to determine additional 5' or 3' sequence of a polynucleotide molecule of the invention;
as a selective probe to identify or isolate a polynucleotide of the invention on an RNA or
DNA blot, or genomic or cDNA library; or as a selective inhibitor of transcription or
translation of an inventive polynucleotide in a tissue, cell or cell extract.

The following GenBank Expressed Sequence Tags are specifically excluded as
fragments of the invention:

1) ARSDR1 related fragments: (EST) AA 035790, AA 442517, AA 587226,
AA 454187, AI 659469, AA 076597, AA 828243, AJ 753763, AI 051146. Also
excluded as a fragment of the invention is the BAC clone R-1012A1 (GenBank accession
number: AL 049779).

2) TMPRSS2 related fragments: (EST) AI 393270, AA 60224,
PN_10Dll_bd.rl, AT 660243, AI 674580, AA 225818, AA 534046, D25996,
AA 876896. Also excluded as a fragment of the invention is the 216 bp nucleic acid
described in Paoloni-Giacobino et al., supra.

3) PART-1 related fragments: (EST) AA 410580, AA 640889, AI 627693,
AI 269149, AA 419011, AA 569503, AJ 870129, AA 226501, AA 226220.

A fragment of a polynucleotide molecule of the invention includes at least
about 15 contiguous nucleotides from the reference polynucleotide or a complementary
sequence thereto, can include at least about 16, 17, 18, 19, 20 or at least 25 nucleotides,
often includes at least about 30, 40, 50, 100, 300 or 500 nucleotides, and can include up
to the full length of the reference polynucleotide molecule minus one nucleotide.
Fragments of such lengths are able to selectively hybridize with the subject
polynucleotide in a variety of detection formats described herein.
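
The length and contiguity criteria in the preceding paragraph can be expressed as a simple string check. The Python sketch below is a rough illustration under stated assumptions: sequences are plain DNA strings over A, C, G and T, the "complementary sequence" is taken to be the reverse complement read 5' to 3', and the function names and example reference are hypothetical. It tests exact sequence membership only and says nothing about hybridization behavior or the excluded ESTs.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_fragment(candidate: str, reference: str, min_length: int = 15) -> bool:
    """Check the length and contiguity criteria described in the text.

    The candidate must be at least `min_length` nucleotides long, shorter than
    the full reference (full length minus one nucleotide at most), and occur
    contiguously in the reference or in its reverse complement.
    """
    candidate = candidate.upper()
    reference = reference.upper()
    if not (min_length <= len(candidate) < len(reference)):
        return False
    return candidate in reference or candidate in reverse_complement(reference)


if __name__ == "__main__":
    # Hypothetical reference, not one of the sequences of the invention.
    reference = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    print(is_fragment(reference[5:25], reference))                     # 20 contiguous nt
    print(is_fragment(reverse_complement(reference)[:15], reference))  # complementary strand
    print(is_fragment(reference[:10], reference))                      # too short
```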

As used herein, the term "functional fragment," when used in reference to a
polynucleotide comprising the ARSDR1 polynucleotide (SEQ ID NO:8), is intended to
refer to any portion of the ARSDR1 polynucleotide having at least one of the biological
activities of the subject polynucleotides. Thus, a functional fragment can be a portion of
the polynucleotide that enhances or suppresses transcription. For example, a functional
fragment of the ARSDR1 polynucleotide (SEQ ID NO:8) may contain an androgen
response element (ARE) located at nucleotides 2,246 to 2,259 of SEQ ID NO:8 that
exhibits increased expression upon androgen exposure. Alternatively, a functional
fragment of the ARSDR1 polynucleotide may contain a progesterone response element
(PRE) located at nucleotides 2,175 to 2,189 of SEQ ID NO:8 and nucleotides 2,627 to
2,641 of SEQ ID NO:8, respectively, relative to the transcription start site that exhibits
increased expression upon progesterone exposure.

As used herein, the term "functional fragment" when used in reference to the 5'
promoter and regulatory region of TMPRSS2 (SEQ ID NO:9) is intended to refer to a
portion of SEQ ID NO:10 having at least one of the activities of its parent polynucleotide
molecule. For example, a functional fragment of SEQ ID NO:9 may contain an ARE
located at nucleotides 576-590 of SEQ ID NO:9 that exhibits increased expression upon
androgen exposure.

As used herein, the term "functional fragment" when used in reference to a
polypeptide of the invention, is intended to refer to a peptide fragment that is a portion of
a full length polypeptide, provided that the portion has a biological activity that is
characteristic of the corresponding full length polypeptide. The term is also intended to
include polypeptides that include, for example, modified forms of naturally occurring
amino acids such as D-stereoisomers, non-naturally occurring amino acids, amino acid
analogues and mimetics so long as such polypeptides retain functional activity as defined
below.

More specifically, the term "functional fragment" when used in reference to
an ARSDR1 polypeptide, refers to any peptide sequence which can be identified using the
binding and routine methods, such as bioassays described herein. An ARSDR1
polypeptide functional fragment can be, for example, a NAD(H)/NADP(H) binding site
referenced herein as amino acids 44 to 50 of SEQ ID NO:2 or a catalytic activity site
referenced as amino acids 198 to 202 of SEQ ID NO:2.

As used herein, the term "functional fragment" when used in reference to
a PART-1 or TMPRSS2 polypeptide is intended to refer to a portion of the polypeptide
which retains some or all of the prostate-specificity and androgen regulated expression of the
full length polypeptides shown in SEQ ID NOS:4 and 6.

The term "substantially the nucleotide sequence," as used herein in reference to a
polynucleotide of the invention, is intended to mean one of the sequences shown as
SEQ ID NOS:1, 3, 5, 7, 8, 9, 10, and 11 or a similar, non-identical sequence that is
considered by those skilled in the art to be a functionally equivalent sequence. For
example, a polynucleotide sequence that has one or more nucleotide additions, deletions
or substitutions with respect to the subject polynucleotide is encompassed by the
invention, so long as the polynucleotide sequence encodes the same amino acid sequence
or retains its ability to selectively hybridize with the subject polynucleotide. A
polynucleotide having substantially the sequence of one of the subject polynucleotides
can encode, for example, an isotype variant or species homolog. In addition, a
polynucleotide having substantially the nucleotide sequence of the reference
polynucleotide has at least 60% identity with respect to the reference nucleotide
sequence. A polynucleotide having substantially the same nucleotide sequence of the
reference polynucleotide can have at least 70%, at least 90%, or at least 95% identity to
the reference nucleotide sequence.

Sequence comparisons between two (or more) polynucleotides or polypeptides
are typically performed by comparing the two sequences over a
"comparison window" to identify and compare local regions of sequence similarity. A
"comparison window", as used herein, refers to a segment of at least about 20 contiguous
positions, usually about 50 to about 200, more usually about 100 to about 150, in which a
sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned.

Optimal alignment of sequences for comparison may be conducted by local
identity or similarity algorithms such as those described in Smith and Waterman (Adv.
Appl. Math. 2:482 (1981)), by the homology alignment algorithm of Needleman and
Wunsch (J. Mol. Biol. 48:443 (1970)), by the search for similarity method of Pearson
and Lipman (Proc. Natl. Acad. Sci. 85:2444 (1988)), or by the algorithm of Karlin and
Altschul (Proc. Natl. Acad. Sci. 87:2264-2268 (1990); Proc. Natl. Acad. Sci. 90:5873-5877
(1993)). Computerized implementations of these algorithms are commonly used in
the art, such as GAP, BESTFIT, BLAST, BLASTP 2.0.9, TBLASTN, FASTA, and
TFASTA (Wisconsin Genetics Software Package, Genetics Computer Group
(GCG), 575 Science Dr., Madison, Wis.; Altschul et al., 1997), or NBLAST and
XBLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)). See also
http://www.ncbi.nlm.nih.gov. To obtain gapped alignments for comparison purposes,
Gapped BLAST is utilized as described in Altschul et al. (Nucleic Acids Res. 25:3389-3402
(1997)).
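
To make the notion of a local alignment concrete, the following Python sketch computes a Smith-Waterman style local alignment score by dynamic programming. It is a teaching illustration only and is not the implementation found in GAP, BESTFIT, BLAST or FASTA; the scoring values (match 2, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for the example.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Best local alignment score between two sequences (linear gap penalty).

    Scores are floored at zero so that an alignment can start anywhere,
    which is what makes the alignment local rather than global.
    """
    cols = len(b) + 1
    prev = [0] * cols          # scores for the previous row
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # The shared core "ACGT" dominates the local score in this toy example.
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Only two rows of the scoring matrix are kept because the sketch reports the score alone; recovering the aligned region itself would require storing the full matrix and tracing back from the best cell.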

The term "percent identity" means the percentage of amino acids or nucleotides
that occupy the same relative position when two amino acid sequences, or two nucleic
acid sequences, are aligned side by side using the BLAST programs available at
http://www.ncbi.nlm.nih.gov. BLAST nucleotide searches are performed with the
NBLAST program, score = 100, wordlength = 12, to obtain nucleotide sequence identity.
BLAST protein searches are performed with the XBLAST program, score = 50,
wordlength = 3, to obtain amino acid sequence identity. To obtain gapped alignments for
comparison purposes, Gapped BLAST is utilized as described in Altschul et al. (Nucleic
Acids Res. 25:3389-3402 (1997)). When utilizing BLAST and Gapped BLAST
programs, the default parameters of the respective programs (e.g., XBLAST and
NBLAST) are used. Neither N- or C-terminal extensions nor insertions shall be
construed as reducing sequence identity. See http://www.ncbi.nlm.nih.gov.
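
Once two sequences have been aligned, the percent identity described above reduces to counting matching columns. The Python sketch below assumes the alignment has already been produced by some other tool and is supplied as two equal-length strings with "-" marking gaps; that input format and the function name are assumptions. Columns containing a gap are skipped, which is one way to reflect the statement that extensions and insertions are not construed as reducing identity.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared positions that are identical in a pairwise alignment.

    Positions where either sequence has a gap are not compared, so terminal
    extensions and insertions do not lower the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Five of the six compared positions match; the two gap columns are skipped.
    print(round(percent_identity("ACGTAC--", "ACGAACGT"), 1))
```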

The term "percent similarity" is a statistical measure of the degree of relatedness
of two compared protein sequences. The percent similarity is calculated by a computer
program that assigns a numerical value to each compared pair of amino acids based on
chemical similarity (e.g., whether the compared amino acids are acidic, basic,
hydrophobic, aromatic, etc.) and/or evolutionary distance as measured by the minimum
number of base pair changes that would be required to convert a codon encoding one
member of a pair of compared amino acids to a codon encoding the other member of the
pair. Calculations are made after a best fit alignment of the two sequences has been
made empirically by iterative comparison of all possible alignments (Henikoff et al.,
1992, Proc. Natl. Acad. Sci. USA 89:10915-10919).

The term "substantial identity" of polynucleotide sequences means that a
polynucleotide comprises a sequence that has at least 60% sequence identity, preferably
at least 70%, more preferably at least 80% and most preferably at least 90%, compared to
a reference sequence using the programs described above (preferably BLAST) using
standard parameters. One of skill will recognize that these values can be appropriately
adjusted to determine corresponding identity of proteins encoded by two nucleotide
sequences by taking into account codon degeneracy, amino acid similarity, reading frame
positioning and the like. Substantial identity of amino acid sequences for these purposes
normally means sequence identity of at least 60%, preferably at least 70%, more
preferably at least 80%. Polypeptides which are "substantially similar" share sequences
as noted above except that residue positions which are not identical may differ by
conservative amino acid changes. Conservative amino acid substitutions refer to the
interchangeability of residues having similar side chains. For example, a group of amino
acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a
group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a
group of amino acids having amide-containing side chains is asparagine and glutamine; a
group of amino acids having aromatic side chains is phenylalanine, tyrosine, and
tryptophan; a group of amino acids having basic side chains is lysine, arginine, and
histidine; and a group of amino acids having sulfur-containing side chains is cysteine and
methionine. Preferred conservative amino acid substitution groups are:
valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and
asparagine-glutamine.
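
As a non-limiting sketch, the side-chain groups recited above can be encoded as sets of one-letter amino acid codes and used to test whether a given substitution is conservative. The dictionary layout and the function name are hypothetical and serve only to illustrate the grouping.

```python
# Side-chain groups as recited in the text, in one-letter amino acid codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),         # Gly, Ala, Val, Leu, Ile
    "aliphatic-hydroxyl": set("ST"),   # Ser, Thr
    "amide-containing": set("NQ"),     # Asn, Gln
    "aromatic": set("FYW"),            # Phe, Tyr, Trp
    "basic": set("KRH"),               # Lys, Arg, His
    "sulfur-containing": set("CM"),    # Cys, Met
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall within one of the recited groups."""
    a, b = original.upper(), replacement.upper()
    return any(a in group and b in group for group in SIDE_CHAIN_GROUPS.values())
```

Under this encoding, a valine-to-leucine change is conservative, whereas a lysine-to-aspartate change is not, because aspartate appears in none of the recited groups.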

As used herein, the term "substantially the amino acid sequence" when used in
reference to a TMPRSS2 polypeptide is intended to refer to any amino acid sequence
having at least about 56% identity with respect to the reference amino acid sequence
shown as SEQ ID NO:4. A polypeptide having substantially the same amino acid
sequence as the reference polypeptide can have, for example, 60%, 70%, 80%, 90% or
more amino acid sequence identity to the reference amino acid sequence shown as SEQ
ID NO:4. Amino acid sequence identity can be determined, for example, in the
following manner. The portion of the amino acid sequence of TMPRSS2 (SEQ ID
NO:4) extending from amino acid 1 up to and including amino acid 492 is used to search
a nucleic acid sequence database, such as the Genbank database, using the program
BLASTP version 2.0.9 (Altschul et al., 1997, Nucleic Acids Res. 25:3389-3402).

As used herein, the term "substantially the amino acid sequence" when used in
reference to an ARSDR1 polypeptide is intended to refer to any amino acid sequence
having at least about 26% identity with respect to the reference amino acid sequence
shown as SEQ ID NO:2. A polypeptide having substantially the same amino acid
sequence as the reference polypeptide can have, for example, 30%, 40%, 50%, 60%,
70%, 80%, 90% or more amino acid sequence identity to the reference amino acid
sequence shown as SEQ ID NO:2.

A polypeptide having substantially the amino acid sequence of the reference
polypeptide retains comparable functional and biological activity characteristic of the
reference polypeptide. It is recognized, however, that polypeptides, or encoding nucleic
acids, containing less than the described levels of sequence identity arising as splice
variants or that are modified by conservative amino acid substitutions are also
encompassed within the scope of the present invention.

As used herein, the term "probe" is intended to refer to a single-stranded
polynucleotide, or analogs thereof, that has a sequence of nucleotides that includes at
least 10, at least 20, at least 50, at least 100, at least 200, at least 300, at least 400, or at
least 500 contiguous bases that are the same as or the complement of any contiguous
bases set forth in any of SEQ ID NOS:1, 3, 5, 7, 8, 9, 10 and 11. In addition, the entire
sequence corresponding to SEQ ID NOS:1, 3, 5, 7, 8, 9, 10 and 11 can be used as a
probe. A probe has the ability to selectively hybridize to its subject polynucleotide
molecule and can be labeled by methods well-known in the art, as described hereinafter,
and used, for example, in various diagnostic kits.
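
Purely as an illustration, the contiguous-base criterion for a probe can be checked with a short routine such as the one below. The function names are hypothetical, the sequences are assumed to contain only A, C, G and T, and "the complement" is interpreted here as the reverse complement, which is an assumption of this sketch rather than a statement of the definition.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def qualifies_as_probe(candidate: str, reference: str, min_contiguous: int = 10) -> bool:
    """True if the candidate shares at least `min_contiguous` contiguous bases
    with the reference strand or with its reverse complement."""
    candidate = candidate.upper()
    targets = (reference.upper(), reverse_complement(reference))
    # Slide a window of the required length along the candidate.
    for start in range(len(candidate) - min_contiguous + 1):
        window = candidate[start:start + min_contiguous]
        if any(window in target for target in targets):
            return True
    return False
```

The default of 10 bases corresponds to the shortest length recited above; the longer thresholds (20, 50, 100 and so on) can be passed as `min_contiguous`.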

The term "antibody" encompasses polyclonal and monoclonal antibody
preparations, as well as preparations including hybrid antibodies, altered antibodies,
F(ab')2 fragments, F(ab) molecules, Fv fragments, single domain antibodies, chimeric
antibodies and functional fragments thereof which exhibit immunological binding
properties of the parent antibody molecule.

As used herein, the term "monoclonal antibody" refers to an antibody
composition having a homogeneous antibody population. The term is not limited by the
manner in which it is made. The term encompasses whole immunoglobulin molecules,
as well as Fab molecules, F(ab')2 fragments, Fv fragments, and other molecules that
exhibit immunological binding properties of the parent monoclonal antibody molecule.
Methods of making polyclonal and monoclonal antibodies are known in the art and
described more fully below.

The term "antigen" is defined herein to include any substance that may be
specifically bound by an antibody molecule. An "immunogen" is an antigen that is
capable of initiating lymphocyte activation resulting in an antigen-specific immune
response.

The term "epitope" is used herein to mean a site on an antigen to which specific
B-cells and T-cells respond. The term is also used interchangeably with "antigenic
determinant" or "antigenic determinant site." A peptide epitope can comprise 3 or more
amino acids in a spatial conformation unique to the epitope. Generally, an epitope
consists of at least 5 such amino acids and, more usually, consists of at least 8-10 such
amino acids. Methods of determining spatial conformation of amino acids are known in
the art and include, for example, x-ray crystallography and 2-dimensional nuclear
magnetic resonance spectroscopy. Furthermore, the identification of epitopes in a given
protein is readily accomplished using techniques well known in the art. See, e.g., Geysen
et al., Proc. Natl. Acad. Sci. 81:3998 (1984) (general method of rapidly synthesizing
peptides to determine the location of immunogenic epitopes in a given antigen); U.S. Pat.
No. 4,708,871 (procedures for identifying and chemically synthesizing epitopes of
antigens); and Geysen et al., Molecular Immunology 23:709 (1986) (technique for
identifying peptides with high affinity for a given antibody). Antibodies that recognize
the same epitope can be identified in a simple immunoassay showing the ability of one
antibody to block the binding of another antibody to a target antigen.

A "standard" as used herein is a quantitative or qualitative measurement of a
compound at a known concentration used for comparing samples with unknown
concentrations of the same or related compounds. Preferably, it is based on a statistically
appropriate number of samples and is created to use as a basis of comparison when
performing diagnostic assays. Diagnostic assays may in turn be used for monitoring
clinical trials, or following patient treatment profiles.

As described herein, the term "prostate neoplastic condition" is intended to refer
to a benign or malignant and metastatic prostate lesion of proliferating cells. For
example, primary prostate tumors are classified into stages TX, T0, T1, T2, T3, and T4.
Metastatic prostate cancer is classified into stages D1, D2, and D3. The term is also
intended to include prostate neoplasia.

As used herein, the term "sample" is intended to mean any biological fluid, cell,
tissue, organ or portion thereof, that includes or potentially includes nucleic acids and
polypeptides of the invention. The term includes samples present in an individual as well
as samples obtained or derived from the individual. For example, a sample can be a
histologic section of a specimen obtained by biopsy, or cells that are placed in or adapted
to tissue culture. A sample further can be a subcellular fraction or extract, or a crude or
substantially pure nucleic acid or protein preparation. A sample can be prepared by
methods known in the art suitable for the particular format of the detection method.

As used herein, the term "detectable label" refers to a molecule that renders a
nucleic acid of the invention detectable by an analytical method. An appropriate
detectable label depends on the particular assay format, and such labels are well known
by those skilled in the art. For example, a detectable label specific for a polynucleotide
molecule can be a complementary polynucleotide molecule, such as a hybridization
probe, that selectively hybridizes to the polynucleotide molecule. A hybridization probe
can be labeled with a measurable moiety, such as a radioisotope, fluorochrome,
chemiluminescent marker, biotin, or other moiety known in the art that is measurable by
analytical methods. A detectable label also can be a polynucleotide molecule without a
measurable moiety. For example, PCR or RT-PCR primers can be used without
conjugation to selectively amplify all or a desired portion of the polynucleotide molecule.
The amplified polynucleotide can then be detected by methods known in the art.

As used herein, the term "binding agent" when used in reference to ARSDR1,
TMPRSS2, and PART-1 polypeptides is intended to mean a compound, a
macromolecule, including polypeptide, DNA, RNA and carbohydrate that selectively
binds a reference polypeptide or fragment thereof. For example, a binding agent can be a
polypeptide that selectively binds with high affinity or avidity to the polypeptides of the
present invention, without substantial cross-reactivity with other polypeptides that are
unrelated to the reference polypeptide. The affinity of a binding agent that selectively
binds to a reference polypeptide will generally be greater than about 10'^ M and more
usually greater than about 10'^ M. High affinity interactions can be preferred, and will
generally be greater than about 10"^ M to 10'^ M. Specific examples of such selective
binding agents include a polyclonal or monoclonal antibody specific or selective for a
polypeptide of the present invention or a peptide, polynucleotide, nucleic acid, nucleic
acid derivative, steroid or steroid analog, small organic molecule, identified, for example,
by affinity screening of a library. For certain applications, a binding agent can be utilized
that preferentially recognizes a particular conformational or post-translationally modified
state of ARSDR1, TMPRSS2, or PART-1 polypeptides. The binding agent can be
labeled with a detectable moiety, if desired, or rendered detectable by specific binding to
a detectable secondary binding agent.

As used herein, the term "expression level" when used in reference to ARSDR1,
TMPRSS2, and PART-1 is intended to refer to the extent, amount or rate of synthesis of
the nucleotide sequences shown as SEQ ID NOS:1, 3, 5, 8, 9, 10 and 11 or the
polypeptides shown as SEQ ID NOS:2, 4 and 6. The extent, amount or rate of synthesis
can be determined by measuring the accumulation or synthesis of the reference RNA,
reference polypeptide or by measuring the reference polypeptide activity.

As used herein, the term "analog" when used in reference to a short-chain
dehydrogenase/reductase substrate is intended to mean any agent which can be oxidized
or reduced in the presence of ARSDR1. For example, the short-chain
dehydrogenase/reductase substrate analog can be a heterocyclic organic compound
having minor modifications of the short-chain dehydrogenase/reductase substrate amino
acid sequence.

Within the biological arts, the term "about" when used in reference to a
particular activity or measurement is intended to refer to the referenced activity or
measurement as being within a range of values encompassing the referenced value and
within accepted standards of a credible assay within the art, or within accepted statistical
variance of a credible assay within the art.

As used herein, the term "analog" when used in reference to a serine protease
substrate is intended to mean any agent which is cleaved at about the same rate in the
presence of TMPRSS2 as the referenced polypeptide. For example, the serine protease
substrate analog can be a peptide having minor modifications of the serine protease
substrate amino acid sequence.

As used herein, the term "inhibitor" when used in reference to ARSDR1 is
intended to refer to an agent effecting a decrease in the extent, amount or rate of
ARSDR1 expression or effecting a decrease in the activity of ARSDR1. For example,
one group of inhibitors which decrease the activity of ARSDR1 includes short-chain
dehydrogenase/reductase inhibitors. Specific examples of short-chain
dehydrogenase/reductase inhibitors include, for example, steroids, steroid derivatives and
analogs. Other examples of ARSDR1 inhibitors which effect a decrease in ARSDR1
expression include ARSDR1 antisense polynucleotides and transcriptional inhibitors that
bind to the ARSDR1 5' promoter and regulatory region.

As used herein, the term "inhibitor" when used in reference to TMPRSS2 is
intended to refer to an agent effecting a decrease in the extent, amount or rate of
TMPRSS2 expression or effecting a decrease in the activity of TMPRSS2. For
example, one group of inhibitors which decrease the activity of TMPRSS2 includes
serine protease inhibitors. Specific examples of serine protease inhibitors include, for
example, antitrypsin and antithrombin. Examples of TMPRSS2 inhibitors which effect a
decrease in TMPRSS2 expression include TMPRSS2 antisense polynucleotides and
transcriptional inhibitors that bind to the TMPRSS2 5' promoter/regulatory region.

As used herein, the term "inhibitory amount" is intended to refer to the amount of
an inhibitor necessary to effect a reduction of at least about 2-fold in the extent, amount
or rate of transcription and/or protein synthesis and/or activity.

As used herein, the term "reduced coenzyme" when used in reference to ARSDR1
is intended to refer to a coenzyme that has been reduced during a dehydrogenation
reaction mediated by ARSDR1. During the dehydrogenation reaction the substrate is
oxidized by the removal of two hydrogen atoms from the substrate. One of the removed
hydrogen atoms is directly transferred to the coenzyme, thereby reducing the coenzyme,
for example, nicotinamide-adenine dinucleotide (NAD+) to NADH or
nicotinamide-adenine dinucleotide phosphate (NADP+) to NADPH. As used herein, the
term "non-reduced coenzyme" is intended to refer to the ARSDR1 coenzyme in its
oxidized form, for example, NAD+ or NADP+.

As used herein, the term "substrate" when used in reference to ARSDR1 is
intended to refer to the non-oxidized state of a reactant that is known to become oxidized
in an ARSDR1-catalyzed reaction. The term "product," when used in reference to
ARSDR1 as used herein is intended to refer to a reactant in an oxidized state that is the
product of a dehydrogenation reaction catalyzed by ARSDR1.
Isolation of Prostate-Specific cDNA Clones

The androgen-regulated prostate specific polynucleotide molecules of the present
invention were identified by hybridization screening of prostate mRNA against a diverse
population of prostate derived probes which were immobilized in a two-dimensional
array. A complete description of the methods used for identification, cloning and
sequencing of transcripts (SEQ ID NOS:1, 3, 5, and 7) is set forth in the Example
sections corresponding to each of the referenced polynucleotides.

In brief, two-dimensional microarrays containing a diverse set of prostate derived
cDNAs were screened using RNA from a prostate cell line. A non-redundant set of 1500
prostate-derived cDNA clones was identified from the Prostate Expression Database.
The inserts of the cDNAs were amplified with primers BL_m13F
(5'-GTAAAACGACGGCCAGTGAATTG-3') (SEQ ID NO:12) and BL_m13R
(5'-ACACAGGAAACAGCTATGACCATG-3') (SEQ ID NO:13) utilizing PCR and
spotted onto glass microscope slides to form a microarray. To identify genes
transcriptionally regulated by androgens, the microarrays of prostate derived cDNAs
were screened using total RNA isolated from LNCaP cells cultured for 72 hours either in
the presence or absence of androgen.

Hybridized microarray slides were scanned with an Array Scanner Generation II
(Amersham, Piscataway, NJ). Intensity ratios for each cDNA clone hybridized with
probes derived from androgen-stimulated LNCaP and androgen-starved LNCaP were
calculated (stimulated intensity/starved intensity). A gene expression level change was
treated as significantly different between the two conditions if all four replicate spots for
a given cDNA demonstrated a ratio greater than 2 or less than 1/2 and the signal intensity
was greater than 2 standard deviations above the image background.
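
The selection criteria above can be sketched, for illustration only, as a small filter over the four replicate spots of one cDNA. The function and parameter names are hypothetical. The sketch further assumes that all four ratios must change in the same direction, and that "2 standard deviations above the image background" means each spot's signal exceeds the background mean plus twice the background standard deviation; neither reading is stated explicitly in the text.

```python
from typing import Sequence


def is_significant(
    ratios: Sequence[float],     # stimulated/starved intensity ratio per replicate spot
    signals: Sequence[float],    # signal intensity per replicate spot
    background_mean: float,
    background_sd: float,
    replicates: int = 4,
) -> bool:
    """Apply the replicate-ratio and signal-intensity criteria to one cDNA."""
    if len(ratios) != replicates or len(signals) != replicates:
        return False

    # Every spot must be brighter than background mean + 2 SD (assumed reading).
    threshold = background_mean + 2.0 * background_sd
    bright_enough = all(signal > threshold for signal in signals)

    # All replicates must agree on the direction of change (assumed reading).
    up_regulated = all(ratio > 2.0 for ratio in ratios)
    down_regulated = all(ratio < 0.5 for ratio in ratios)

    return bright_enough and (up_regulated or down_regulated)
```

A cDNA whose four spots give ratios of 2.5, 3.1, 2.2 and 4.0 with strong signals would pass, while the same cDNA with one replicate at 1.8 would not.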

Microarray hybridization with androgen-stimulated and androgen-starved LNCaP
cDNA probes revealed four cDNAs, designated ARSDR1, TMPRSS2, PART-1 and
clone 8C3, whose expression was consistently up-regulated using the above criteria.
Sequence analysis and BLAST searches against the GenBank databases identified
ARSDR1, PART-1 and 8C3 cDNAs as novel genes. Sequence database analysis of the
TMPRSS2 cDNA revealed it to be identical to a previously identified serine protease
gene that had been mistakenly designated as being expressed in a small intestine-specific
fashion (Paoloni-Giacobino et al., Genomics 44:309-329 (1997)). The polynucleotide
and polypeptide sequences of the present invention are generally described in the
following sections corresponding to each of the prostate-specific, androgen-regulated
polynucleotides of the present invention.
ARSDR1

ARSDR1 is a multidomain short-chain dehydrogenase/reductase that is
androgen-regulated and predominantly expressed in prostate tissue. ARSDR1 is
predominantly transcribed into an about 2.5 kb mRNA transcript. A polynucleotide
corresponding to the 5' promoter and regulatory region of the ARSDR1 transcript was
identified (Example 5). The ARSDR1 promoter and regulatory region is about 3113
nucleotides in length and is set forth as nucleotides 1 to 3113 of SEQ ID NO:8. As
described herein, ARSDR1 was identified as a prostate specific and androgen regulated
polynucleotide. Consistent with these functional characteristics, the nucleotide sequence
of the ARSDR1 promoter and regulatory region was found to include an androgen
response element (ARE). The ARE in ARSDR1 is located at nucleotides 2,246 to 2,259
and is an about 15 nucleotide sequence with substantial similarity to consensus AREs
(Roche et al., Mol. Endocrinol. 6:2229-2235 (1992)). In addition, the nucleotide
sequence of the ARSDR1 promoter and regulatory region was found to include two
progesterone response elements (PREs) at nucleotides 2,175 to 2,189 and 2,627 to 2,641
of SEQ ID NO:8, respectively. The PREs are about 15 nucleotides in length with
substantial similarity to consensus PREs (Lieberman et al., Mol. Endocrinol. 7:515-527
(1993)).

The promoter and regulatory region contains binding sites for various
transcription and related regulatory factors. The domains carrying these binding sites are
functional as binding agents in a variety of methods known in the art to inhibit or identify
factors which bind to one or more of these domains in a sequence specific manner.
These domains are also useful to construct expression vectors which confer, for example,
tissue specificity and androgen regulation. That is, the ARSDR1 promoter and regulatory
region (as well as the other polynucleotide promoter regions of the present invention) can
be used to make prostate-specific, androgen regulated expression vectors comprising
the inventive promoter operably linked to a heterologous nucleotide sequence. Fragments
of the ARSDR1 promoter and regulatory region which independently exhibit
one or more binding activities or other transcriptional activity of the full length sequence
are therefore considered functional fragments of the promoter and regulatory region.
Specific examples are portions of the ARSDR1 nucleotide sequence containing the ARE
sequence, set forth as nucleotides 2,246 to 2,259 in SEQ ID NO:8, and the PRE
sequences set forth as nucleotides 2,175 to 2,189 and 2,627 to 2,641 in SEQ ID NO:8.

ARSDR1 is a member of the short-chain dehydrogenase/reductase (SDR) family
of proteins. SDRs are a large family of NAD(H) or NADP(H) dependent oxidoreductases.
Members of the SDR family of proteins include many enzymes involved in steroid
metabolism including, for example, estradiol 17-beta-dehydrogenase, human
15-hydroxyprostaglandin dehydrogenase, and 11-beta-hydroxysteroid dehydrogenase
(Jornvall et al., Biochemistry 34:6003-6013 (1995)). Proteins belonging to the SDR
family share amino acid residue identities of only 15-30%, indicating early evolutionary
divergence.

The ARSDR1 polypeptide consists of about 318 amino acid residues having the
sequence shown in SEQ ID NO:2. Two consensus sequences are conserved in the SDR
family: the NAD(H) or NADP(H) binding domain, an N-terminal segment
GlyXXXGlyXXGly (SEQ ID NO:14), and the catalytic domain, a sequence
TyrXXXLys (SEQ ID NO:15) (Jornvall et al., supra, 1995; Ghosh et al.,
Structure 2:629-640 (1994)). The ARSDR1 polypeptide contains both of these motifs
conserved in the SDR family. ARSDR1 also contains two Asn-glycosylation sites at
amino acid positions 174 and 198 (SEQ ID NO:2) that are conserved among SDR family
proteins. Another characteristic of ARSDR1 is that it contains two protein kinase C
phosphorylation sites at amino acid positions 57 and 106 (SEQ ID NO:2).
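
For illustration, the two consensus motifs can be located in a protein sequence with simple pattern matching, where each X is treated as any single residue. The sketch below follows the motifs as written in the text above, namely GlyXXXGlyXXGly and TyrXXXLys; the pattern names, the function name and the toy sequence used in the example are hypothetical and are not taken from SEQ ID NO:2.

```python
import re

# One-letter translations of the motifs as written in the text; "." stands for X.
SDR_MOTIFS = {
    "coenzyme_binding": "G...G..G",  # GlyXXXGlyXXGly
    "catalytic": "Y...K",            # TyrXXXLys
}


def find_motifs(protein: str) -> dict:
    """Return 1-based start positions of each motif in the protein sequence."""
    protein = protein.upper()
    hits = {}
    for name, pattern in SDR_MOTIFS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", protein)]
    return hits
```

Applied to the made-up sequence "MAGSTAGLIGAAVYSASKMM", the routine reports the coenzyme-binding pattern starting at residue 3 and the catalytic pattern starting at residue 14.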
TMPRSS2 

TMPRSS2 is encoded by a transcript of about 4,650 nucleotides in length. A
complete description of the methods used for identification, cloning and sequencing of
the full length transcript is set forth below in Examples 8-11. The complete nucleotide
sequence of the TMPRSS2 encoding transcript is shown in SEQ ID NO:3 and the
deduced amino acid sequence is shown in SEQ ID NO:4. The full length transcript
contains a 5' untranslated region (UTR) of 56 nucleotides and a 3' UTR consisting
of 3,115 nucleotides.
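
As a consistency check offered only by way of example, the stated transcript length can be recovered from the UTR lengths given above together with the 492-residue length of the TMPRSS2 polypeptide recited elsewhere in this description. The helper name is hypothetical, and the sketch assumes a single stop codon and no other untranslated sequence.

```python
def transcript_length(utr5_nt: int, residues: int, utr3_nt: int) -> int:
    """Length of a transcript: 5' UTR + coding region (incl. stop codon) + 3' UTR."""
    coding_nt = 3 * (residues + 1)  # three nucleotides per residue, plus one stop codon
    return utr5_nt + coding_nt + utr3_nt


# 56 + 3 * (492 + 1) + 3115 = 56 + 1479 + 3115 = 4650
tmprss2_length = transcript_length(utr5_nt=56, residues=492, utr3_nt=3115)
```

Under these assumptions the total comes to 4,650 nucleotides, in agreement with the transcript length of about 4,650 nucleotides stated above.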

A partial nucleotide sequence and deduced amino acid sequence have been
published by Paoloni-Giacobino et al., supra. However, prostate specific expression has
not previously been reported. The TMPRSS2 encoding polynucleotide sequence
described herein extends the Paoloni-Giacobino et al. sequence by about 2,172
nucleotides at the 3' terminus. The new TMPRSS2 3' UTR sequence is shown as
nucleotides 914 to 3,118 in SEQ ID NO:10. In cloning the full length transcript for
TMPRSS2, a partial sequence was initially obtained from cDNA clone 10D11, which
was 2,681 nucleotides in length. Clone 10D11 begins about 286 nucleotides 5' to
the translation stop codon of TMPRSS2 and terminates about 723 nucleotides from the 3'
end of the full length transcript (see SEQ ID NO:3). Therefore, clone 10D11 contains a
region of 1,449 nucleotides at its 3' terminus that was not described previously by
Paoloni-Giacobino et al.

In addition to the TMPRSS2 full length transcript and fragments described above,
a polynucleotide corresponding to the 5' promoter and regulatory region was additionally
isolated and sequenced. The method used for identifying this genomic sequence is
described further below in Example 11. The isolated promoter/regulatory region is
about 869 nucleotides in length and is set forth in SEQ ID NO:9. As described herein,
TMPRSS2 was identified as a prostate specific and androgen regulated polynucleotide.
Consistent with these functional characteristics, the nucleotide sequence of the
TMPRSS2 promoter/regulatory region was found to include an ARE. The ARE is
located at nucleotides 576 to 590 of SEQ ID NO:9 and is an about 15 nucleotide
sequence with substantial similarity to consensus AREs.

The promoter/regulatory region contains binding sites for various transcription
and related regulatory factors. The domains carrying these binding sites are functional as
binding agents in a variety of methods known in the art to inhibit or identify factors
which bind to one or more of these domains in a sequence specific manner. These
domains are also useful to construct expression vectors which confer, for example, tissue
specificity and androgen regulation. Fragments of the TMPRSS2 promoter/regulatory
region which independently exhibit one or more binding activities or other transcriptional
activity of the full length sequence are therefore considered functional fragments of the
promoter/regulatory region. A specific example is a TMPRSS2 polynucleotide fragment
containing the ARE sequence set forth as nucleotides 576 to 590 of SEQ ID NO:9.

TMPRSS2 is a multidomain serine protease that is predominantly expressed in
prostate tissue. The polypeptide is 492 amino acid residues in length and
includes functional domains for serine protease activity, a scavenger receptor
cysteine-rich domain, a LDL receptor class domain and a transmembrane domain. The
serine protease domain extends from amino acid residue Arg255 to the carboxyl-terminus
of the deduced polypeptide (SEQ ID NO:4). There is about 45-55% identity with several
members of the serine protease family, including for example, human hepsin (GenBank
accession number: X07002), human enterokinase (GenBank accession number: P98073)
and human kallikrein (hk2) (GenBank accession number: P03952). The TMPRSS2
protease domain is similar to the S1 family of the SA clan of serine-type peptidases as
described by Rawlings, N.D., and Barrett, A.J., Methods Enzymol. 244:19-61 (1994).
Chymotrypsin and trypsin are examples of members of this family of serine proteases.

The TMPRSS2 active site residues have been identified as His296, Asp345, and
Ser441, and the cleavage specificity has been deduced to be hydrolysis of peptide bonds
after Lys or Arg residues due to the presence of Asp435 at the base of the S1 subsite
which binds to the substrate (SEQ ID NO:4). TMPRSS2 contains nine conserved cysteine
residues with the intersubunit disulfide bond between Cys758-Cys912 (SEQ ID NO:4)
joining the catalytic protease subunit with the non-protease domains of the polypeptide.
The amino-terminal Ile residue of the protease domain is included within the peptide
sequence Arg-Ile-Val-Gly-Gly (RIVGG), which is characteristic for the proteolytic
activator site of many serine protease zymogens (Rawlings and Barrett, supra).

TMPRSS2 contains a hydrophobic sequence at amino acids 84-106 of SEQ ID
NO:4 that is characteristic of a transmembrane domain (Hofmann, K., and Stoffel, W.,
Biol. Chem. Hoppe-Seyler 374:166 (1993)). The transmembrane domain is not preceded
by a peptide leader sequence, indicating that TMPRSS2 is a type II integral membrane
protein in which the amino terminus is located on the cytoplasmic side of the membrane
(Parks, G.D., and Lamb, R.A., J. Biol. Chem. 268:19101-19109 (1993)).

In addition to the transmembrane domain, TMPRSS2 contains a third region
characteristic of a low-density lipoprotein receptor A domain (LDLRA domain). This
domain extends from Cys113 to Cys148 in TMPRSS2 (SEQ ID NO:4). A characteristic
LDLRA domain is about 40 amino acids long and contains 6 disulfide-bonded cysteines
(Sudhoff et al., Science 228:815-822 (1985)). These cysteines have been identified in
TMPRSS2 as amino acid residues 113, 120, 126, 133, 139, and 148 (SEQ ID NO:4).

Finally, TMPRSS2 also contains a scavenger receptor cysteine-rich domain
(SRCR). SRCR domains characteristically are about 100 amino acids long and rich in
cysteine (Resnick et al., Trends Biochem. Sci. 19:5-8 (1994)). The SRCR domain of
TMPRSS2 corresponds to amino acid residues Val149 to Leu242 (SEQ ID NO:4). The
SRCR domain of TMPRSS2 contains a consensus sequence characteristic of group A
SRCR. Proteins with SRCR domains are known to be expressed either on the cell
surface or secreted into plasma or other body fluids.
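
The domain boundaries recited above for TMPRSS2 (SEQ ID NO:4) can be collected, by way of example, into a simple lookup table that reports which domain contains a given residue. The table layout and the function name are hypothetical, and residues falling between the recited boundaries are simply reported as unassigned.

```python
from typing import Optional

# (domain name, first residue, last residue), 1-based and inclusive,
# using the boundaries recited in the text for SEQ ID NO:4.
TMPRSS2_DOMAINS = [
    ("transmembrane", 84, 106),
    ("LDLRA", 113, 148),
    ("SRCR", 149, 242),
    ("serine protease", 255, 492),
]


def domain_at(position: int) -> Optional[str]:
    """Name of the recited domain containing the residue, or None if unassigned."""
    for name, first, last in TMPRSS2_DOMAINS:
        if first <= position <= last:
            return name
    return None
```

For instance, the active site residue His296 falls in the serine protease domain, and the cysteine at position 113 falls in the LDLRA domain.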
PART-1 

PART-1 is an androgen-regulated polypeptide that is predominantly expressed in
prostate tissue. The PART-1 polypeptide consists of 60 amino acid residues and has two
protein kinase phosphorylation sites as well as one tyrosine kinase phosphorylation site.
The PART-1 polypeptide is encoded by an approximately 2.1 kb messenger RNA (SEQ
ID NO:2). The PART-1 promoter region (SEQ ID NO:3) is approximately 2.0 kb in size
and contains a binding site for the homeo-domain containing protein Pbx-1a at
nucleotides 536 to 544 of SEQ ID NO:12 as described by Van Dijk et al., Proc. Natl.
Acad. Sci. 90:6061-6065 (1993). The nucleotide sequence corresponding to the PART-1
cDNA combined with a portion of its 5' promoter and regulatory region has been
described as clone 14D7. The nucleotide sequence of this clone therefore is a composite
sequence of the about 2,106 nucleotide PART-1 cDNA and the about 603 nucleotide
promoter and regulatory region of the PART-1 transcription unit.

Also provided is an isolated PART-1 polynucleotide having the nucleotide
sequence shown as SEQ ID NO:2. In addition to the PART-1 full length transcript and
fragments described above, a polynucleotide corresponding to the 5' promoter and
regulatory region was isolated. The method used for identifying the nucleotide sequence
of the 5' promoter and regulatory region is described further below in Example 15.
The isolated promoter/regulatory region polynucleotide is about 1969 nucleotides in
length and is set forth in SEQ ID NO:11. As described herein, PART-1 was identified as
a prostate-specific and androgen regulated transcript. The PART-1 Pbx-1a binding site
region is shown in SEQ ID NO:11 as nucleotides 536 to 544.

The PART-1 promoter/regulatory region contains binding sites for various
transcription and related regulatory factors. The domains carrying these binding sites are
functional as binding agents in a variety of methods known in the art to inhibit or identify
factors which bind to one or more of these domains in a sequence specific manner.
These domains are also useful to construct expression vectors which confer, for example,
tissue specificity and androgen regulation. Fragments of the PART-1
promoter/regulatory region which independently exhibit one or more binding activities or
other transcriptional activity of the full length sequence are therefore considered
functional fragments of the promoter/regulatory region. A specific example is a PART-1
nucleic acid fragment containing the Pbx-1a binding site set forth in SEQ ID NO:11 as
nucleotides 536 to 544.
8C3 

Identification, characterization, cloning, and chromosomal localization of
cDNA 8C3 were performed according to essentially the same methods described for
ARSDR1, TMPRSS2 and PART-1 above and in Examples 1-16. 8C3 is an
androgen-regulated transcript of about 4,500 nucleotides in length that is predominantly
expressed in prostate tissue. The 8C3 polynucleotide was identified by hybridization
screening of prostate mRNA against a diverse population of prostate derived probes
which were immobilized in a two-dimensional array. A complete description of the
methods used for identification, cloning and sequencing of the 8C3 polynucleotide is set
forth below in Example 1. The 8C3 cDNA has been mapped to 16q24, a region of
the human genome that has been shown to experience a high incidence of chromosomal
loss in advanced prostate cancer. Therefore, it is likely that 8C3 is a tumor suppressor.

The nucleotide sequence of the 8C3 encoding transcript is shown in SEQ ID NO:7.

The ARSDR1, TMPRSS2, 8C3 and PART-1 polynucleotide and polypeptide
sequences of the invention are collectively referred to herein as the polynucleotides and
polypeptides of the invention, respectively.

In one aspect of the present invention, substantially pure polynucleotides are
provided that are capable of hybridizing under stringent conditions to at least 15
contiguous nucleotides of the nucleotide sequences shown as SEQ ID NOS: 1, 3, 5 and 7,
or complementary sequences thereof.
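The notion of a complementary sequence can be illustrated with a minimal Python sketch. The 15-nucleotide sequence used here is made up for illustration and is not taken from any SEQ ID NO of this document; the function name is likewise an assumption.

```python
# Illustrative sketch only: the probe below is a hypothetical 15-mer,
# not a fragment of any sequence disclosed in this document.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)[::-1]


probe = "ATGCCGTTAGCATGC"            # hypothetical 15-nucleotide stretch
target = reverse_complement(probe)   # strand the probe would hybridize to
print(probe, "pairs with", target)
```

A probe and the strand it hybridizes to are related by this operation, so applying it twice returns the original sequence.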

In another aspect of the invention, substantially pure polynucleotides having
substantially the nucleotide sequences shown as SEQ ID NOS:1, 3, 5, 7, 8, 9, 10 and 11,
or functional fragments thereof, are also provided. Functional fragments of SEQ ID
NOS: 8, 9 and 11 may contain, for example, a 5' promoter or a transcription regulatory
region, such as, for example, an androgen regulatory element. The promoter and
regulatory regions of the present invention contain binding sites for various transcription
and related regulatory factors. The domains carrying these binding sites are functional as
binding agents in a variety of methods known in the art to inhibit or identify factors
which bind to one or more of these domains in a sequence specific manner. These
domains are also useful to construct expression vectors which confer, for example, tissue
specificity and androgen regulation. Functional fragments of the ARSDR1, TMPRSS2
and PART-1 promoter and regulatory regions which independently exhibit one or more
binding activities or other transcriptional activity of the full length sequence are therefore
considered functional fragments of the inventive promoter and regulatory regions.

All of the polynucleotides described above, and fragments thereof, are useful as
hybridization probes in diagnostic procedures. The probes can be as long as the full
length transcript or as short as about 10-15 nucleotides, and preferably about 15-18
nucleotides. They can correspond to coding region or untranslated region sequence. The
particular application and degree of desired specificity will be one consideration well
known to those skilled in the art in selecting a probe. For example, if it is desired to
detect an mRNA encoding one of the prostate-specific polypeptides of the present
invention or other related species, the user can choose coding sequence probes and low
stringency hybridization conditions. Alternatively, using high stringency conditions with
the same probe will select only polynucleotides that actually encode the referenced
inventive polypeptide. Untranslated region sequences are useful regions to construct
probes since there is little evolutionary pressure to conserve non-coding domains.
However, probes as small as 15 nucleotides are statistically unique sequences within the
human genome. Therefore, fragments of the inventive sequences that are generally of 15
nucleotides or more in length can be constructed from essentially any region of the
transcript or promoter and regulatory region and be capable of uniquely hybridizing to
ARSDR1, TMPRSS2, PART-1 and 8C3 polynucleotides.
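The statistical reasoning behind probe length can be sketched with a simple back-of-the-envelope model. The sketch below assumes a random genome with equal base frequencies and a haploid size of about 3.2 x 10^9 base pairs; both are simplifying assumptions introduced here, not figures from this document. Under that model, a 15-mer lies near the threshold at which chance matches become rare, and each added nucleotide lowers the expected number of chance matches four-fold.

```python
def expected_chance_matches(probe_length: int,
                            genome_size: float = 3.2e9,  # assumed haploid size
                            both_strands: bool = True) -> float:
    """Expected chance occurrences of one specific k-mer in a random
    sequence with equal base frequencies (a simplifying assumption)."""
    if probe_length < 1:
        raise ValueError("probe_length must be a positive integer")
    strands = 2 if both_strands else 1
    return strands * genome_size / (4 ** probe_length)


for k in (10, 15, 16, 17, 18):
    print(f"{k:2d} nt: {expected_chance_matches(k):.3g} expected chance matches")
```

Real genomes contain repeats and biased base composition, so the model only indicates the order of magnitude; candidate probes would still be checked against sequence databases.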

The probes can be produced recombinantly or chemically synthesized using
methods well known in the art. Additionally, ARSDR1, TMPRSS2, PART-1 and 8C3
hybridization probes can be labeled with a variety of detectable labels including, for
example, radioisotopes, fluorescent tags, reporter enzymes, biotin and other ligands.
Such detectable labels can additionally be coupled with, for example, colorimetric or
photometric indicator substrate for spectrophotometric detection. Methods for labeling
and detecting such probes are well known in the art and can be found described in, for
example, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold
Spring Harbor Press, Plainview, New York (1989), and Ausubel et al., Current
Protocols in Molecular Biology (Supplement 47), John Wiley & Sons, New York (1999).

Therefore, the invention further provides a substantially pure polynucleotide
probe having substantially the nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 8, 9, 10
and 11, or fragment thereof. A fragment of the above referenced polynucleotide probes
having substantially the sequence of SEQ ID NOS:1, 3, 5, 7, 8, 9, 10 and 11 can, for
example, be an oligonucleotide of about 15-18 nucleotides in length.
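As a rough illustration of how oligonucleotide fragments of about 15-18 nucleotides could be enumerated from a longer sequence, the following Python sketch slides windows of each length along an input string. The input sequence is a hypothetical placeholder, not a sequence of the invention, and the function name is an assumption.

```python
from typing import Iterator, Tuple


def probe_fragments(sequence: str,
                    min_len: int = 15,
                    max_len: int = 18) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, fragment) for every contiguous window of
    min_len..max_len nucleotides in the sequence."""
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            yield start, sequence[start:start + length]


# Hypothetical 20-nucleotide sequence used purely for illustration.
example = "ATGCCGTTAGCATGCAAGTC"
fragments = list(probe_fragments(example))
print(len(fragments), "candidate fragments; first:", fragments[0])
```

In practice each candidate would then be screened for properties such as melting temperature and cross-hybridization, which this sketch does not attempt.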

In another aspect, the present invention is directed to isolated prostate-specific
polypeptides (such as polypeptides encoded by the polynucleotide molecules of the
present invention) that are androgen regulated. The polypeptides of the present invention
can be isolated, for example, by incorporating a polynucleotide molecule of the invention
(such as a cDNA molecule) into an expression vector, introducing the expression vector
into a host cell and expressing the polynucleotide molecule to yield polypeptide. The
polypeptide can then be purified by art-recognized means. When a crude polypeptide
extract is initially prepared, it may be desirable to include one or more proteinase
inhibitors in the extract. Representative examples of proteinase inhibitors include: serine
proteinase inhibitors (such as phenylmethylsulfonyl fluoride (PMSF), benzamide,
benzamidine HCl, ε-amino-n-caproic acid and aprotinin (Trasylol)); cysteine proteinase
inhibitors, such as sodium p-hydroxymercuribenzoate; competitive proteinase inhibitors,
such as antipain and leupeptin; covalent proteinase inhibitors, such as iodoacetate and
N-ethylmaleimide; aspartate (acidic) proteinase inhibitors, such as pepstatin and
diazoacetylnorleucine methyl ester (DAN); metalloproteinase inhibitors, such as EGTA
[ethylene glycol bis(β-aminoethyl ether) N,N,N',N'-tetraacetic acid], and the chelator
1,10-phenanthroline.

In another aspect, the present invention is directed to antibodies that bind
specifically to the prostate-specific polypeptides ARSDR1, TMPRSS2, PART-1 or
polypeptide fragments thereof. By way of representative example, antigen useful for
raising antibodies can be prepared in the following manner. A full-length cDNA
molecule of the present invention (or a cDNA molecule of the invention that is not
full-length, but which includes a coding region encoding an antigenic polypeptide) can be
cloned into a plasmid vector, such as a Bluescript plasmid (available from Stratagene,
Inc., La Jolla, California). The recombinant vector is then introduced into an E. coli
strain (such as E. coli XL1-Blue, also available from Stratagene, Inc.) and the
polypeptide encoded by the cDNA is expressed in E. coli and then purified. For
example, E. coli XL1-Blue harboring a Bluescript vector including a cDNA molecule of
interest can be grown overnight at 37°C in LB medium containing 100 µg ampicillin/ml.
A 50 µl aliquot of the overnight culture can be used to inoculate 5 ml of fresh LB
medium containing ampicillin, and the culture grown at 37°C with vigorous agitation to
the desired A600 before induction with 1 mM IPTG. After an additional two hours of growth,
the suspension is centrifuged (1000 x g, 15 min, 4°C), the media removed, and the
pelleted cells resuspended in 1 ml of cold buffer that preferably contains 1 mM EDTA
and one or more proteinase inhibitors, such as those described herein in connection with
the purification of the isolated polypeptides of the present invention. The cells can be
disrupted by sonication with a microprobe. The chilled sonicate is cleared by
centrifugation and the expressed, recombinant polypeptide purified from the supernatant
by art-recognized protein purification techniques, such as those described herein.

Alternatively, polypeptide fragments of the inventive proteins can be prepared
using peptide synthesis methods that are well known in the art. The synthetic
polypeptide fragment can then be used to prepare antibodies that are specific to any one
of the proteins of the present invention. Direct peptide synthesis using solid-phase
techniques (Stewart et al., Solid-Phase Peptide Synthesis, W H Freeman Co, San
Francisco, Calif. (1969); Merrifield, J. Am. Chem. Soc. 85:2149-2154 (1963)) is an
alternative to recombinant or chimeric peptide production. Automated synthesis may be
achieved, for example, using Applied Biosystems 431A Peptide Synthesizer (Foster City,
Calif.) in accordance with the instructions provided by the manufacturer. Additionally,
the polypeptide sequences of the present invention or any fragment thereof may be
mutated during direct synthesis and, if desired, combined using chemical methods with
other amino acid sequences. The polypeptides used to induce specific antibodies may
have an amino acid sequence consisting of at least five amino acids and preferably at
least 10 amino acids. Short stretches of amino acid sequence may be attached to those
of another polypeptide, and the chimeric polypeptide used for antibody production.
Alternatively, the polypeptide may be of sufficient length to contain an entire domain for
antibody recognition.

Representative examples of art-recognized techniques for purifying, or partially
purifying, polypeptides from biological material are exclusion chromatography,
ion-exchange chromatography, hydrophobic interaction chromatography, reversed-phase
chromatography and immobilized metal affinity chromatography.

Hydrophobic interaction chromatography and reversed-phase chromatography are
two separation methods based on the interactions between the hydrophobic moieties of a
sample and an insoluble, immobilized hydrophobic group present on the chromatography
matrix. In hydrophobic interaction chromatography the matrix is hydrophilic and is
substituted with short-chain phenyl or octyl nonpolar groups. The mobile phase is
usually an aqueous salt solution. In reversed-phase chromatography the matrix is silica
that has been substituted with longer n-alkyl chains, usually C8 (octylsilyl) or C18
(octadecylsilyl). The matrix is less polar than the mobile phase. The mobile phase is
usually a mixture of water and a less polar organic modifier.

Separations on hydrophobic interaction chromatography matrices are usually
done in aqueous salt solutions, which generally are nondenaturing conditions. Samples
are loaded onto the matrix in a high-salt buffer and elution is by a descending salt
gradient. Separations on reversed-phase media are usually done in mixtures of aqueous
and organic solvents, which are often denaturing conditions. In the case of polypeptide
and/or peptide purification, hydrophobic interaction chromatography depends on surface
hydrophobic groups and is carried out under conditions which maintain the integrity of
the polypeptide molecule. Reversed-phase chromatography depends on the native
hydrophobicity of the polypeptide and is carried out under conditions which expose
nearly all hydrophobic groups to the matrix, i.e., denaturing conditions.

Ion-exchange chromatography is designed specifically for the separation of ionic
or ionizable compounds. The stationary phase (column matrix material) carries ionizable
functional groups, fixed by chemical bonding to the stationary phase. These fixed
charges carry a counterion of opposite sign. This counterion is not fixed and can be
displaced. Ion-exchange chromatography is named on the basis of the sign of the
displaceable charges. Thus, in anion ion-exchange chromatography the fixed charges are
positive and in cation ion-exchange chromatography the fixed charges are negative.

Retention of a molecule on an ion-exchange chromatography column involves an
electrostatic interaction between the fixed charges and those of the molecule; binding
involves replacement of the nonfixed ions by the molecule. Elution, in turn, involves
displacement of the molecule from the fixed charges by a new counterion with a greater
affinity for the fixed charges than the molecule, and which then becomes the new,
nonfixed ion.

The ability of counterions (salts) to displace molecules bound to fixed charges is a
function of the difference in affinities between the fixed charges and the nonfixed
charges of both the molecule and the salt. Affinities in turn are affected by several
variables, including the magnitude of the net charge of the molecule and the
concentration and type of salt used for displacement.

Solid-phase packings used in ion-exchange chromatography include cellulose,
dextrans, agarose, and polystyrene. The exchange groups used include DEAE
(diethylaminoethyl), a weak base, that will have a net positive charge when ionized and
will therefore bind and exchange anions; and CM (carboxymethyl), a weak acid, with a
negative charge when ionized that will bind and exchange cations. Another form of
weak anion exchanger contains the PEI (polyethyleneimine) functional group. This
material, most usually found on thin layer sheets, is useful for binding polypeptides at pH
values above their pI. The polystyrene matrix can be obtained with quaternary
ammonium functional groups for strong base anion exchange or with sulfonic acid
functional groups for strong acid cation exchange. Intermediate and weak ion-exchange
materials are also available. Ion-exchange chromatography need not be performed using
a column, and can be performed as batch ion-exchange chromatography with the slurry
of the stationary phase in a vessel such as a beaker.
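The relationship between buffer pH, polypeptide pI and choice of exchanger can be summarized in a short Python sketch. It relies only on the general rule that a polypeptide carries a net negative charge above its pI and a net positive charge below it; the function name and the numerical pH and pI values are hypothetical and chosen for illustration.

```python
def exchanger_for(buffer_ph: float, polypeptide_pi: float) -> str:
    """Suggest the exchanger type expected to bind a polypeptide.

    Above its pI a polypeptide is net negative (binds positive fixed
    charges, i.e. an anion exchanger such as DEAE or PEI); below its pI
    it is net positive (binds negative fixed charges, e.g. CM).
    """
    if buffer_ph > polypeptide_pi:
        return "anion exchanger (fixed positive charges, e.g. DEAE, PEI)"
    if buffer_ph < polypeptide_pi:
        return "cation exchanger (fixed negative charges, e.g. CM)"
    return "neither (no net charge at pH equal to pI)"


# Hypothetical values for illustration only.
print(exchanger_for(buffer_ph=8.0, polypeptide_pi=6.2))
print(exchanger_for(buffer_ph=5.0, polypeptide_pi=6.2))
```

The rule is a first approximation: local charge patches and buffer ionic strength also influence binding, so conditions are normally confirmed empirically.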

Gel filtration is performed using porous beads as the chromatographic support. A
column constructed from such beads will have two measurable liquid volumes, the
external volume, consisting of the liquid between the beads, and the internal volume,
consisting of the liquid within the pores of the beads. Large molecules will equilibrate
only with the external volume while small molecules will equilibrate with both the
external and internal volumes. A mixture of molecules (such as proteins) is applied in a
discrete volume or zone at the top of a gel filtration column and allowed to percolate
through the column. The large molecules are excluded from the internal volume and
therefore emerge first from the column while the smaller molecules, which can access the
internal volume, emerge later. The volume of a conventional matrix used for protein
purification is typically 30 to 100 times the volume of the sample to be fractionated. The
absorbance of the column effluent can be continuously monitored at a desired
wavelength using a flow monitor.
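The 30- to 100-fold sizing guideline stated above can be turned into a small calculation. The sketch below simply multiplies a sample volume by the two factors; the function name and the 2 ml sample volume are illustrative assumptions.

```python
from typing import Tuple


def matrix_volume_range(sample_volume_ml: float,
                        low_factor: float = 30.0,
                        high_factor: float = 100.0) -> Tuple[float, float]:
    """Return the (minimum, maximum) gel filtration matrix volume in ml
    for a sample, using the 30x to 100x guideline from the text."""
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    return sample_volume_ml * low_factor, sample_volume_ml * high_factor


# Hypothetical 2 ml sample.
low, high = matrix_volume_range(2.0)
print(f"Matrix volume: {low:.0f} to {high:.0f} ml")
```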

A technique that is often applied to the purification of polypeptides is High
Performance Liquid Chromatography (HPLC). HPLC is an advancement in both the
operational theory and fabrication of traditional chromatographic systems. HPLC
systems for the separation of biological macromolecules vary from the traditional column
chromatographic systems in three ways: (1) the column packing materials are of much
greater mechanical strength, (2) the particle size of the column packing materials has
been decreased 5- to 10-fold to enhance adsorption-desorption kinetics and diminish
bandspreading, and (3) the columns are operated at 10-60 times higher mobile-phase
velocity. Thus, by way of non-limiting example, HPLC can utilize exclusion
chromatography, ion-exchange chromatography, hydrophobic interaction
chromatography, reversed-phase chromatography and immobilized metal affinity
chromatography. Art-recognized techniques for the purification of proteins and peptides
are set forth in Methods in Enzymology, Vol. 182, Guide to Protein Purification, Murray
P. Deutscher ed. (1990). In particular, Section IV, chapter 14, of the Deutscher
publication discloses representative techniques for the preparation of protein extracts
from plant material.

Methods for preparing monoclonal and polyclonal antibodies are well known to
those of ordinary skill in the art and are set forth, for example, in chapters five and six of
Antibodies: A Laboratory Manual, E. Harlow and D. Lane, Cold Spring Harbor
Laboratory (1988). In one representative example, polyclonal antibodies specific for a
purified protein can be raised in a New Zealand rabbit implanted with a whiffle ball. One
µg of protein is injected at intervals directly into the whiffle ball granuloma. A
representative injection regime is injections (each of 1 µg protein) at day 1, day 14 and
day 35. Granuloma fluid is withdrawn one week prior to the first injection (preimmune
serum), and forty days after the final injection (postimmune serum).

An antibody is specific for one of the inventive proteins if it is produced against
an epitope of the polypeptide and binds to at least part of the natural or recombinant
polypeptide. Antibody production includes not only the stimulation of an immune
response by injection into animals, but also analogous processes such as the production
of synthetic antibodies, the screening of recombinant immunoglobulin libraries for
specific-binding molecules (Orlandi et al., Proc. Natl. Acad. Sci. 86:3833-3837 (1989),
or Huse et al., Science 246:1275-1281 (1989)), or the in vitro stimulation of lymphocyte
populations.

Current technology (Winter and Milstein, Nature 349:293-299 (1991)) provides
for a number of highly specific binding reagents based on the principles of antibody
formation. These techniques may be adapted to produce molecules which specifically
bind to the inventive proteins or fragments thereof. Antibodies or other appropriate
molecules generated against a specific immunogenic peptide fragment or oligopeptide
can be used in Western analysis, enzyme-linked immunosorbent assays (ELISA) or
similar tests to establish the presence of or to quantitate amounts of any one of the inventive
proteins in normal, diseased, or therapeutically treated cells or tissues. Variations on any
procedure known in the art for the measurement of protein can be used in the practice of
the instant invention. Such procedures include but are not limited to competitive and
non-competitive assay systems using techniques such as radioimmunoassays, ELISA
(enzyme linked immunosorbent assay), sandwich immunoassays, agglutination assays,
complement fixation assays, immunoradiometric assays, fluorescent immunoassays,
protein A immunoassays, immunoelectrophoresis assays and the like.

The invention also provides a method of diagnosing or predicting the 
susceptibility of a prostate neoplastic condition in an individual suspected of having a 
neoplastic condition of the prostate. The method comprises: 

(a) obtaining a fluid sample from an individual;

(b) determining the expression level of a polypeptide selected from the group of
polypeptides whose amino acid sequences are shown in SEQ ID NOS: 2, 4, and 6, or a
polynucleotide selected from the group of polynucleotides whose nucleotide sequences
are shown in SEQ ID NOS: 1, 3, 5, 7, 8, 9, 10, and 11; and

(c) comparing said determined expression level of polypeptide or polynucleotide
to a corresponding polypeptide or polynucleotide expression level from a normal fluid
sample, wherein said measured expression level for said polypeptide or polynucleotide
of 2-fold or more from said fluid sample from said individual as compared to said normal
fluid sample indicates the presence of a prostate neoplastic condition.
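The comparison in step (c) amounts to a ratio test against a two-fold threshold. The following Python sketch shows one way such a comparison could be expressed; the function names are hypothetical, the numerical values are invented for illustration, and the expression levels are assumed to be positive values measured in the same arbitrary units for both samples.

```python
def fold_change(sample_level: float, normal_level: float) -> float:
    """Ratio of the individual's expression level to the normal level.
    Both levels are assumed to be in the same (arbitrary) units."""
    if normal_level <= 0:
        raise ValueError("normal expression level must be positive")
    return sample_level / normal_level


def meets_two_fold_threshold(sample_level: float,
                             normal_level: float,
                             threshold: float = 2.0) -> bool:
    """True when expression in the individual's sample is at least
    `threshold`-fold the level in the normal sample."""
    return fold_change(sample_level, normal_level) >= threshold


# Invented example values in arbitrary units.
print(fold_change(10.0, 4.0), meets_two_fold_threshold(10.0, 4.0))
print(fold_change(6.0, 4.0), meets_two_fold_threshold(6.0, 4.0))
```

The sketch covers only the arithmetic of the comparison; how the expression levels are measured and normalized is left to the assay formats described elsewhere in the text.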

In another aspect of the invention, a method of diagnosing or predicting the
susceptibility of a prostate neoplastic condition in an individual suspected of having a
neoplastic condition of the prostate is provided. The method is performed by:

(a) obtaining a prostate cell sample of an individual;

(b) determining the expression level of a polypeptide selected from the group of
polypeptides whose amino acid sequences are shown in SEQ ID NOS: 2, 4, and 6, or a
polynucleotide selected from the group of polynucleotides whose nucleotide sequences
are shown in SEQ ID NOS: 1, 3, 5, 7, 8, 9, 10, and 11; and

(c) comparing the determined expression levels of at least one polypeptide or
polynucleotide to a corresponding polypeptide or polynucleotide expression level from
normal prostate cells or from an androgen-dependent cell line, wherein the measured
expression level for said polypeptide or polynucleotide of 2-fold or more from the
individual compared to normal prostate cells or from an androgen-dependent cell line
indicates the presence of a prostate neoplastic condition.

A prostate neoplastic condition is a benign or malignant prostate lesion of
proliferating cells. Prostate neoplastic conditions include, for example, prostate
intraepithelial neoplasia (PIN) and prostate cancer. Prostate cancer is an uncontrolled
proliferation of prostate cells which can invade and destroy adjacent tissues as well as
metastasize. Primary prostate tumors can be classified into stages TX, T0, T1, T2, T3,
and T4 and metastatic tumors can be classified into stages D1, D2 and D3. Similarly,
there are classifications known by those skilled in the art for the progressive stages of
precancerous lesions or PIN. The methods herein are applicable for the diagnosis or
treatment of any or all stages of prostate neoplastic conditions.

The methods of the invention are also applicable to prostate pathologies other
than neoplastic conditions. Such other pathologies include, for example, benign prostatic
hyperplasia (BPH) and prostatitis. BPH is one of the most common diseases in adult
males. Histological evidence of BPH has been found in more than 40% of men in their
fifties and almost 90% of men in their eighties. The disease results from the
accumulation of non-malignant nodules arising in a small region around the proximal
segment of the prostatic urethra which leads to an increase in prostate volume. If left
untreated, BPH can result in acute and chronic retention of urine, renal failure secondary
to obstructive uropathy, serious urinary tract infection and irreversible bladder
decompensation. Prostatitis is an infection of the prostate. Other prostate pathologies
known to those skilled in the art exist as well and are similarly applicable for diagnosis or
treatment using the methods of the invention. Various neoplastic conditions of the
prostate as well as prostate pathologies can be found described in, for example,
Campbell's Urology, Seventh Edition, W.B. Saunders Company, Philadelphia (1998).
Therefore, the methods of the invention are applicable to both prostate neoplastic
conditions and prostate pathologies.

The invention provides a method of diagnosing or predicting prostate neoplastic
conditions based on the finding of a positive correlation between ARSDR1, TMPRSS2,
PART-1 and 8C3 polypeptide or polynucleotide expression in neoplastic cells of the
prostate and the degree or extent of the neoplastic condition or pathology. The diagnostic
methods of the invention are applicable to numerous prostate neoplastic conditions and
pathologies as described above. One consequence of progression into these neoplastic
and pathological conditions is an increased expression of at least one of the ARSDR1,
TMPRSS2, PART-1 and 8C3 polypeptides or polynucleotides in prostate tissue as well
as secretion into the circulatory system and urine. The increase in at least one of
ARSDR1, TMPRSS2, PART-1 and 8C3 expression in individuals suffering from a
prostate neoplastic condition can be measured by comparing the amount of ARSDR1,
TMPRSS2, PART-1 or 8C3 mRNA and/or polypeptide to that found, for example, in
normal prostate tissue samples or in normal blood or serum samples. A two-fold or more
increase in polypeptide or polynucleotide expression in a prostate cell sample relative to
samples obtained from normal prostate cells or from an androgen-dependent cell line is
indicative of a prostate neoplastic condition or pathology. Similarly, an increase in at
least one of ARSDR1, TMPRSS2, PART-1 and 8C3 polypeptide or polynucleotide
expression leading to two-fold or more secretion of polypeptide into the blood or other
circulatory fluids of the individual compared to normal blood or fluid samples also is
indicative of a prostate neoplastic condition or pathology.

As a diagnostic indicator, the polypeptide or polynucleotide of the present
invention can be used qualitatively to positively identify a prostate neoplastic condition
or pathology as described above. Alternatively, the inventive reagents also can be used
quantitatively to determine the degree or susceptibility of a prostate neoplastic condition
or pathology. For example, successive increases in the expression levels of at least one
of ARSDR1, TMPRSS2, PART-1 or 8C3, including levels of secreted polypeptide in
circulating fluids and urine, can be used as a predictive indicator of the degree or severity
of a prostate neoplastic condition or pathology because increased expression, leading to a
rise in accumulated levels, for example, also positively correlates with increased severity
of a neoplastic condition of the prostate. The higher the level of expression of any one of
ARSDR1, TMPRSS2, PART-1 or 8C3, the later the stage of the prostate neoplastic
condition or pathology. For example, increases in expression levels of two-fold or more
compared to a normal sample are indicative of at least prostate neoplasia. The inventive
polypeptide or polynucleotide probes also can be used quantitatively to distinguish
between pathologies and neoplastic conditions as well as to distinguish between the
different types of neoplastic conditions.

Correlative increases can be determined by comparison of expression of at least
one of ARSDR1, TMPRSS2, PART-1 or 8C3 from the individual having, or suspected of
having, a neoplastic condition of the prostate to expression levels of the corresponding
polypeptide or polynucleotide from known samples determined to exhibit a prostate
neoplastic condition. Alternatively, correlative increases also can be determined by
comparison of expression of at least one of ARSDR1, TMPRSS2, PART-1 or 8C3 from
the test individual to expression levels of other known markers of prostate cancer such as
prostate specific antigen (PSA), glandular kallikrein 2 (hK2) and prostase/PRSS18.
These other known markers can be used, for example, as an internal or external standard
for correlation of stage-specific expression with increases in expression of any one of
ARSDR1, TMPRSS2, PART-1 or 8C3 and severity of the neoplastic or pathological
condition. Conversely, a regression in the severity of a prostate neoplastic condition or
pathology is followed by a corresponding decrease in expression levels of at least one of
ARSDR1, TMPRSS2, PART-1 or 8C3 and can similarly be assessed using the methods
described above.

Given the teachings and guidance provided herein, those skilled in the art will
know or can determine the stage or severity of a prostate neoplastic condition or
pathology based on a determination of expression levels for at least one of ARSDR1,
TMPRSS2, PART-1 or 8C3 polypeptides and/or polynucleotides and using known
procedures and marker comparisons other than those described above. For a review of
recognized values for such other markers in normal versus pathological tissues, see, for
example, Campbell's Urology, Seventh Edition, W.B. Saunders Company, Philadelphia
(1998).

Therefore, the invention provides a method for both diagnosing and prognosing a
prostate neoplastic condition including prostate cancer and prostate intraepithelial
neoplasia as well as other prostate pathologies such as BPH and prostatitis.

The use of at least one of ARSDR1, TMPRSS2, PART-1 or 8C3 expression
levels in prostate cells, the circulatory system and urine as a diagnostic indicator of a
prostate pathology allows for early diagnosis as a predictive indicator when no
physiological or pathological symptoms are apparent. The methods are applicable to any
males, generally those over age 50, African-American males and males with familial
history of prostate neoplastic conditions or pathologies. The diagnostic methods of the
invention also are applicable to individuals predicted to be at risk for prostate neoplastic
conditions or pathologies by reliable prognostic indicators prior to onset of overt clinical
symptoms. All that is necessary is to determine the expression level of at least one of
ARSDR1, TMPRSS2, PART-1 or 8C3 in prostate tissue or circulatory or bodily fluid to
determine whether there is an increase in these polypeptide or polynucleotide levels in
the individual suspected of having a prostate pathology compared to normal individuals.
Those skilled in the art will know, or be able to determine, by using routine examinations
and practices in the field of medicine, those individuals who are applicable candidates for
diagnosis by the methods of the invention.

For example, individuals suspected of having a prostate neoplastic condition or
pathology can be identified by exhibiting presenting signs of prostate cancer which
include, for example, a palpable nodule (which generally occurs in greater than 50% of
the cases), dysuria, cystitis and prostatitis, frequency, urinary retention, or decreased
urine stream. Signs of advanced disease include pain, uremia, weight loss and systemic
bleeding. Prognostic methods of this invention are applicable to individuals after
diagnosis of a prostate neoplastic condition, for example, to monitor improvements or
identify residual neoplastic prostate cells using, for example, imaging methods known in
the art and which target at least one of ARSDR1, TMPRSS2, PART-1 or 8C3
polypeptides or polynucleotides.

Therefore, the invention provides a method of predicting the onset of a prostate
neoplastic condition or pathology. The method consists of determining increased
expression levels of at least one of ARSDR1, TMPRSS2, PART-1 and 8C3 in a prostate
cell sample or in fluids from an individual having or suspected of having a prostate
neoplastic condition or pathology compared to a sample isolated from a normal
individual, where increased expression in the sample indicates the onset of the prostate
neoplastic condition or pathology.

The diagnostic methods of the invention are applicable for use with a variety of
different types of samples isolated or obtained from an individual having, or suspected of
having, a prostate neoplastic condition or prostate pathology. For example, samples
applicable for use in one or more diagnostic formats of the invention include tissue and
cell samples. A tissue or cell sample can be obtained, for example, by biopsy or surgery.
As described below, and depending on the format of the method, the tissue can be used
whole or subjected to various methods known in the art to disassociate the sample into
smaller pieces, cell aggregates or individual cells. Additionally, when combined with
amplification methods such as polymerase chain reaction (PCR), a single prostate cell
sample is sufficient for use in diagnostic assays of the invention which employ
hybridization detection methods. Similarly, when measuring levels of any one of
ARSDR1, TMPRSS2, and PART-1 polypeptide or activity levels, amplification of the
signal with enzymatic coupling or photometric enhancement can be employed using only
a few or a small number of cells.

Whole tissue obtained from a prostate biopsy or surgery is one example of a
prostate cell sample. Whole tissue prostate cell samples can be assayed employing any
of the formats described below. For example, the prostate tissue sample can be mounted
and hybridized in situ with a polynucleotide probe of the present invention. Similar
histological formats employing protein detection methods and in situ activity assays also
can be used to detect polypeptides of the invention in whole tissue prostate cell samples.
Polypeptide detection methods include, for example, staining with antibodies specific for
at least one of the inventive polypeptides and activity assays which result in the
deposition of an ARSDR1, TMPRSS2, or PART-1 end product at the site of enzyme
activity in the sample. Such histological methods as well as others are well known to
those skilled in the art and are applicable for use in the diagnostic methods of the
invention using whole tissue as the source of a prostate cell sample. Methods for
preparing and mounting the samples are similarly well known in the art.

Individual prostate cells and cell aggregates from an individual having, or
suspected of having, a prostate neoplastic condition or pathology are another example of a
prostate cell sample which can be analyzed for increased expression of ARSDR1,
TMPRSS2, PART-1 or 8C3 polypeptide or polynucleotide or activity. The cells can be
grown in culture and analyzed in situ using procedures such as those described above.
The expression level can be determined by, for example, binding agents specific for
ARSDR1, TMPRSS2, or PART-1 polypeptides, or by hybridization to a probe specific to
at least one of ARSDR1, TMPRSS2, PART-1 and 8C3 polynucleotides. Other methods
for measuring the expression level of the inventive polypeptides or polynucleotides in
whole cell samples are known in the art and are similarly applicable in any of the
diagnostic formats described below.

The tissue or whole cell prostate cell sample obtained from an individual also can
be analyzed for increased expression of at least one of ARSDR1, TMPRSS2, PART-1 or
8C3 by lysing the cell and measuring the expression levels of an inventive polypeptide or
polynucleotide in the lysate, a fractionated portion thereof or a purified component
thereof using any of the diagnostic formats described below. For example, if a hybridization
format is used, RNA from one or more of the inventive polynucleotides can be amplified
directly from the lysate using PCR, or other amplification procedures well known in the
art such as RT-PCR or 5' or 3' RACE, to directly measure the expression levels of at least
one of ARSDR1, TMPRSS2, PART-1 or 8C3. RNA also can be isolated and probed
directly such as by solution hybridization or indirectly by hybridization to immobilized
RNA. Similarly, when determining the expression level of the polypeptides of the
invention using polypeptide detection or enzyme activity formats, lysates can be assayed
directly, or they can be further fractionated to enrich for the inventive polypeptides and
their corresponding activities. Numerous other methods applicable for use with various
cell fractions are well known to those skilled in the art and can accordingly be used in the
methods of the invention.

The prostate tissue or cell sample can be obtained directly from the individual or,
alternatively, it can be obtained from other sources for testing. Similarly, the cell sample
can be tested when it is freshly isolated or it can be tested following short or prolonged
periods of cryopreservation without substantial loss in accuracy or sensitivity. If the
sample is to be tested following an indeterminate period of time, it can be obtained and
then cryopreserved, or stored at 4°C for short periods of time, for example. An
advantage of the diagnostic methods of the invention is that they do not require
histological analysis of the sample. As such, the sample can be initially disaggregated,
lysed, fractionated or purified and the active component stored for later diagnosis.

The diagnostic methods of the invention are applicable for use with a variety of
different types of samples other than prostate cell samples. For example, intracellular
polynucleotides and polypeptides of the invention may leak into the extracellular space
when a neoplastic prostate condition causes a disruption of the normal prostate
architecture. Therefore, the diagnostic methods of the invention are applicable with fluid
samples collected from an individual having, or suspected of having, a neoplastic
condition of the prostate or a prostate pathology.

Fluid samples which can be measured for ARSDR1, TMPRSS2, PART-1 or 8C3
expression levels include, for example, blood, serum, lymph, urine and semen. Other
bodily fluids are known to those skilled in the art and are similarly applicable for use as a
sample in the diagnostic methods of the invention. One advantage of analyzing fluid
samples is that they are readily obtainable, in sufficient quantity, without invasive
procedures as required by biopsy and surgery. Analysis of fluid samples such as blood,
serum and urine will generally be in the diagnostic formats described above and below
which measure ARSDR1, TMPRSS2, or PART-1 polypeptide levels or activity. As the
inventive polypeptides are circulating in bodily fluids, the methods will be similar to
those which measure expression levels from cell lysates, fractionated portions thereof or
purified components.

Prostate neoplastic conditions and prostate pathologies can be diagnosed,
predicted or prognosed by measuring the expression levels of the polynucleotides and
polypeptides of the present invention in a prostate cell sample, circulating fluid or other
bodily fluid obtained from the individual. As described above, expression levels can be
measured by a variety of methods known in the art. For example, the expression level of a
nucleic acid of the invention can be determined by measuring the amount of an RNA or
polypeptide of the invention in a sample from the individual. Alternatively, the
expression level of the inventive polypeptides can be determined by measuring the
amount of enzyme activity in the sample, the amount of activity being indicative of the
expression level of the inventive polynucleotide.

Given the teachings and guidance provided herein, the choice of measuring RNA,
polypeptide or activity will be that of the user. Considerations such as the sample type,
availability and amount will also influence selection of a particular diagnostic format.
For example, if the sample is a prostate cell sample and there is only a small amount
available, then diagnostic formats which measure the amount of RNA by, for example,
PCR amplification, can be an appropriate choice for determining the expression level of a
polynucleotide of the invention. Alternatively, if the sample is a blood sample and the
user is analyzing numerous different samples simultaneously, such as in a clinical setting,
then a multi-sample format, such as an Enzyme-Linked Immunosorbent Assay
(ELISA), which measures the amount of polypeptide can be an appropriate choice for
determining the expression level of a polypeptide of the invention. Additionally,
polynucleotides of the invention released into bodily fluids from the neoplastic or
pathological prostate cells can also be analyzed by, for example, PCR or RT-PCR. Those
skilled in the art will know, or can determine, which format is amenable for a particular
application and which methods or modifications known within the art are compatible
with a particular type of format.

Hybridization methods are applicable for measuring the amount of inventive
RNA as an indicator of expression levels. There are numerous methods well known in
the art for detecting polynucleotides by specific or selective hybridization with a
complementary probe. Such methods include both solution hybridization procedures and
solid-phase hybridization procedures where the probe or sample is immobilized to a solid
support. Descriptions for such methods can be found in, for example, Sambrook et al.,
supra, and in Ausubel et al., supra. Specific examples of such methods include PCR and
other amplification methods such as RT-PCR, 5' or 3' RACE, RNase protection, RNA
blot, dot blot or other membrane-based technologies, dip stick, pin, ELISA or
two-dimensional arrays immobilized onto chips as a solid support. These methods can be
performed using either qualitative or quantitative measurements, all of which are well
known to those skilled in the art.

PCR or RT-PCR can be used with isolated RNA or crude cell lysate preparations.
As described previously, PCR is advantageous when there is little starting material. A
further description of PCR methods can be found in, for example, Dieffenbach, C.W.,
and Dveksler, G.S., PCR Primer: A Laboratory Manual, Cold Spring Harbor Press,
Plainview, New York (1995). Multi-sample formats such as an ELISA or
two-dimensional array offer the advantage of analyzing numerous, different samples in a
single assay. A particular example of a two-dimensional array used in a hybridization
format is described further below in the Examples. In contrast, solid-phase dip
stick-based methods offer the advantage of being able to rapidly analyze a patient's fluid
sample and obtain an immediate result.

Polynucleotide probes useful for measuring the expression level of the
polynucleotides of the invention by hybridization include, for example, all of the
polynucleotide probes described previously. More specifically, ARSDR1 probes
include, for example, polynucleotides corresponding to the entire transcribed region of
SEQ ID NO:1 and fragments thereof. Similarly, TMPRSS2, PART-1, and 8C3 probes
include, for example, polynucleotides corresponding to the entire polynucleotide
sequences designated as SEQ ID NOS:1, 3, 5, 7 and fragments thereof, respectively.

Briefly, for detection by hybridization, the polynucleotide probes of the
invention having detectable labels are added to a prostate cell sample or a fluid sample
obtained from the individual having, or suspected of having, a prostate neoplastic
condition or pathology under conditions which allow annealing of the probe to RNA.
Such conditions are well known in the art for both solution and solid phase hybridization
procedures. Moreover, optimization of hybridization conditions can be performed, if
desired, by hybridization of an aliquot of the sample at different temperatures, durations
and in different buffer conditions. Such procedures are routine and well known to those
skilled in the art. Following annealing, the sample is washed and the signal is measured and
compared with a suitable control or standard value. The magnitude of the hybridization
signal is directly proportional to the expression levels of the polynucleotide of the
invention for which the probe was specific.

A suitable control for comparison can be, for example, the expression level of a
polynucleotide of the invention from a prostate cell or a fluid sample obtained from a
normal individual. Another suitable control for comparison is a prostate cell line that is
androgen-dependent. ARSDR1, TMPRSS2, PART-1, and 8C3 expression levels in cell
lines generally should be determined under androgen depleted growth conditions, as their
response to androgen stimulation will be indicative of their respective expression levels
in neoplastic cells. The control sample for comparison can be measured simultaneously
with one or more test samples or, alternatively, expression levels can be established for a
particular type of sample and standardized to internal or external parameters such as
polypeptide or polynucleotide content, cell number or mass of tissue. Such standardized
control samples can then be directly compared with results obtained from the test sample.
An increase of two-fold or more of expression levels of a polynucleotide of the invention
indicates the presence of a prostate neoplastic condition or pathology in the tested
individual.
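
The comparison described above can be illustrated with a minimal sketch. It assumes that the test and control signals have each been standardized to the same kind of parameter (for example, cell number or total RNA content); the function names, parameter names and numerical values are hypothetical and are used only for illustration. The two-fold threshold is the one stated in the text.

```python
def fold_change(test_signal, test_norm, control_signal, control_norm):
    """Ratio of standardized test expression to standardized control expression.

    test_norm and control_norm are the standardization parameters
    (e.g., cell number or mass of tissue); names are hypothetical.
    """
    if test_norm <= 0 or control_norm <= 0:
        raise ValueError("standardization values must be positive")
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return (test_signal / test_norm) / (control_signal / control_norm)


def indicates_increase(fold, threshold=2.0):
    """True when expression is increased two-fold or more over the control."""
    return fold >= threshold


# Illustrative values only, not data from the specification.
fc = fold_change(test_signal=900.0, test_norm=1.5,
                 control_signal=400.0, control_norm=2.0)
print(fc, indicates_increase(fc))
```

In this sketch the control can be a value measured alongside the test sample or a previously established standardized value; either is passed in the same way.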

The diagnostic procedures described above and below using ARSDR1,
TMPRSS2, PART-1, and 8C3 polynucleotide and polypeptide probes can additionally be
used in conjunction with other prostate markers, such as prostate specific antigen (PSA),
human glandular kallikrein 2 (hK2) and prostase/PRSS18 for simultaneous or
independent corroboration of a sample. Moreover, while the diagnostic procedures
described above and below describe using ARSDR1, TMPRSS2, PART-1, and 8C3
individually, these markers can also be used in combination. Those skilled in the art will
know which markers are applicable for use in conjunction with a polynucleotide or
polypeptide of the invention to delineate more specific diagnostic information such as
that described above.

Therefore, the invention provides a method of diagnosing or predicting the
susceptibility of a prostate neoplastic condition in an individual suspected of having a
neoplastic condition of the prostate where the expression level of a polynucleotide of the
invention is determined by measuring the amount of its respective RNA. The amount of
ARSDR1, TMPRSS2, PART-1, and 8C3 RNA can be determined by hybridization with
a polynucleotide probe having substantially the nucleotide sequence of SEQ ID NOS:1,
3, 5, 7, or functional fragment thereof, respectively, and wherein the fragment consists of
an oligonucleotide of about 15-18 nucleotides in length.

The invention additionally provides a method of diagnosing or predicting the
susceptibility of a prostate neoplastic condition in an individual suspected of having a
neoplastic condition of the prostate where the expression level of an inventive
polypeptide is determined by measuring the amount of polypeptide. The method
comprises contacting a cell, a cell lysate, or fractionated sample thereof, from an
individual suspected of having a neoplastic condition with a binding agent selective for
one of the inventive polypeptides, and determining the amount of selective binding of the
agent. The fractionated sample can be a cell lysate or lipid membranes and the binding
agent can be an antibody or a non-hydrolyzable substrate analog depending upon which
inventive polypeptide is being assayed. For example, when the assay is directed to
PART-1 the fraction can be lipid membranes and the selective binding agent can be an
antibody. Alternatively, when the assayed polypeptide is ARSDR1, the fractionated
sample can be a cell lysate and the binding agent can be an antibody or non-hydrolyzable
short-chain dehydrogenase/reductase substrate analog.

Essentially all modes of affinity binding assays are applicable for use in
determining the amount of a polypeptide of the invention in a sample. Such methods are
rapid, efficient and sensitive. Moreover, affinity binding methods are simple and can be
adjusted to be performed in a variety of clinical settings and under conditions to suit a
variety of particular needs. Affinity binding assays which are known and can be used in
the methods of the invention include both soluble and solid phase formats. A specific
example of a soluble phase affinity binding assay is immunoprecipitation using an
antibody selective for a polypeptide of the invention or other binding agent, such as, for
example, a steroid or steroid derivative for ARSDR1. Solid phase formats are
advantageous for the methods of the invention since they are rapid and can be performed
more easily on multiple different samples simultaneously without losing sensitivity or
accuracy. Moreover, solid phase affinity binding assays are further amenable to high
throughput screening and automation.

Specific examples of solid phase affinity binding assays include immunoaffinity
binding assays such as an ELISA and radioimmune assay (RIA). Other solid phase
affinity binding assays are known to those skilled in the art and are applicable to the
methods of the invention. Although affinity binding assays are generally formatted for
use with an antibody that is selective for the analyte or ligand of interest, essentially any
binding agent can be alternatively substituted for the selectively binding antibody. Such
binding agents include, for example, steroids, steroid derivatives, macromolecules such
as polypeptides, peptides, nucleic acids, lipids and sugars as well as small molecule
compounds. Other binding agents selective for ARSDR1 and TMPRSS2 include, for
example, non-hydrolyzable short-chain dehydrogenase/reductase substrate analogs and
non-hydrolyzable serine protease substrate analogs, respectively. Methods are known in
the art for identifying such molecules which bind selectively to a particular analyte or
ligand and include, for example, combinatorial libraries. Thus, for a molecule other than
an antibody to be used in an affinity binding assay, all that is necessary is for the binding
agent to exhibit selective binding activity for the inventive polypeptide.

Various modes of affinity binding formats are similarly known which can be used
in the diagnostic methods of the invention. For the purpose of illustration, particular
embodiments of such affinity binding assays will be described further in reference to
immunoaffinity binding assays. The various modes of affinity binding assays, such as
immunoaffinity binding assays, include, for example, solid phase ELISA and RIA as well
as modifications thereof. Such modifications thereof include, for example, capture
assays and sandwich assays as well as the use of either mode in combination with a
competition assay format. The choice of which mode or format of immunoaffinity
binding assay to use will depend on the intent of the user. Such methods can be found
described in common laboratory manuals such as Harlow and Lane, Using Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory Press, New York (1999).

As with the hybridization methods described previously, the diagnostic formats
employing affinity binding can be used in conjunction with a variety of detection labels
and systems known in the art to quantitate amounts of a polypeptide of the invention in
the analyzed sample. Detection systems include the detection of bound polypeptide of
the invention by both direct and indirect means. Direct detection methods include
labeling of an antibody or binding agent that binds specifically to a polypeptide of the
invention. Indirect detection systems include, for example, the use of labeled secondary
antibodies and binding agents.

Secondary antibodies, labels and detection systems are well known in the art and
can be obtained commercially or by techniques well known in the art. The detectable
labels and systems employed with a binding agent that is specific to a polypeptide of the
invention should not impair binding of the agent to its cognate inventive polypeptide.
Moreover, multiple antibody and label systems can be employed for detecting bound
antigen/antibody complexes of the invention to enhance the sensitivity of the binding
assay if desired.

As with the hybridization formats described previously, detectable labels can be
essentially any label that can be quantitated or measured by analytical methods. Such
labels include, for example, enzymes, radioisotopes, fluorochromes as well as chemi- and
bioluminescent compounds. Specific examples of enzyme labels include horseradish
peroxidase (HRP), alkaline phosphatase (AP), β-galactosidase, urease and luciferase.

A horseradish-peroxidase detection system can be used, for example, with the
chromogenic substrate tetramethylbenzidine (TMB), which yields a soluble product in
the presence of hydrogen peroxide that is detectable by measuring absorbance at 450 nm.
An alkaline phosphatase detection system can be used with the chromogenic substrate
p-nitrophenyl phosphate, for example, which yields a soluble product readily detectable
by measuring absorbance at 405 nm. Similarly, a β-galactosidase detection system can
be used with the chromogenic substrate o-nitrophenyl-β-D-galactopyranoside (ONPG),
which yields a soluble product detectable by measuring absorbance at 410 nm, or a
urease detection system can be used with a substrate such as urea-bromocresol purple
(Sigma Immunochemicals, St. Louis, MO). Luciferin is the substrate compound for
luciferase which emits light following ATP-dependent oxidation.

Fluorochrome detection labels are rendered detectable through the emission of
light of ultraviolet or visible wavelength after excitation by light or another energy
source. DAPI, fluorescein, Hoechst 33258, R-phycocyanin, B-phycoerythrin,
R-phycoerythrin, rhodamine, Texas red and lissamine are specific examples of
fluorochrome detection labels that can be utilized in the affinity binding formats of the
invention. Particularly useful fluorochromes include fluorescein and rhodamine.

Chemiluminescent as well as bioluminescent detection labels are convenient for
sensitive, non-radioactive detection of the inventive polynucleotides and polypeptides
and can be obtained commercially from various sources such as Amersham Lifesciences,
Inc. (Arlington Heights, IL).

Radioisotopes can alternatively be used as detectable labels for use in the binding
assays of the invention. Iodine-125 is a specific example of a radioisotope useful for a
detectable label.

Signals from detectable labels can be analyzed, for example, using a
spectrophotometer to detect color from a chromogenic substrate; a fluorometer to detect
fluorescence in the presence of light of a certain wavelength; or a radiation counter to
detect radiation, such as a gamma counter for detection of iodine-125. For detection of
an enzyme-linked secondary antibody, for example, a quantitative analysis of the amount
of bound agent can be made using a spectrophotometer such as an EMAX Microplate
Reader (Molecular Devices, Menlo Park, CA) in accordance with the manufacturer's
instructions. If desired, the assays of the invention can be automated or performed
robotically, and the signal from multiple samples can be detected simultaneously.

The diagnostic formats of the present invention can be forward, reverse or
simultaneous as described in U.S. Patent No. 4,376,110 and No. 4,778,751. Separation
steps for the various assay formats described herein, including the removal of unbound
secondary antibody, can be performed by methods known in the art (Harlow and Lane,
supra). For example, washing with a suitable buffer can be followed by filtration,
aspiration, vacuum or magnetic separation as well as by centrifugation.

A binding agent selective for a polypeptide of the invention also can be utilized in
imaging methods that are targeted at prostate cells expressing the nucleic acids of the
invention. These imaging techniques will have utility in identification of residual
neoplastic cells at the primary site following standard treatments including, for example,
radical prostatectomy, radiation or hormone therapy. In addition, imaging techniques
that detect neoplastic prostate cells have utility in detecting secondary sites of metastasis.
A binding agent specific for one of the polypeptides of the invention can be radiolabeled
with, for example, 111Indium and infused intravenously as described by Kahn et al.
(Journal of Urology 152:1952-1955 (1994)). The binding agent selective for a
polypeptide of the invention can be, for example, a monoclonal antibody selective for
any one of the inventive polypeptides. Imaging can be accomplished by, for example,
radioimmunoscintigraphy as described by Kahn et al., supra.

The invention additionally provides a method of diagnosing or predicting the
susceptibility of a prostate neoplastic condition in an individual suspected of having a
neoplastic condition of the prostate where the inventive polypeptide expression level is
determined by measuring the amount of ARSDR1 or TMPRSS2 enzyme activity. In the
case of ARSDR1, the method comprises contacting a cell, a cell lysate, or fractionated
sample thereof, from the individual with a short-chain dehydrogenase/reductase
substrate selective for ARSDR1, and determining the amount of product formed by
ARSDR1. When ARSDR1 activity is used in the method the fractionated sample can be
cell lysate. Alternatively, when TMPRSS2 is being assayed the inventive method
comprises contacting a cell, a cell lysate, or fractionated sample thereof, from the
individual with a serine protease substrate selective for TMPRSS2, and determining the
amount of cleavage product produced by TMPRSS2. When TMPRSS2 activity is used
in the method the fractionated sample can be lipid membranes.

Another diagnostic format which can be used for determining the expression
levels of ARSDR1 and TMPRSS2 is measurement of the short-chain
dehydrogenase/reductase activity and serine protease activity, respectively, in a sample.
As with the hybridization and affinity binding formats, activity assays can similarly be
performed using essentially identical methods and modes of analysis. Therefore, solution
and solid phase modes, including multi-sample ELISA, RIA and two-dimensional array
procedures are applicable for use in measuring the short-chain dehydrogenase/reductase
activity of ARSDR1 and the serine protease activity of TMPRSS2. In the case of
ARSDR1, activity can be measured by, for example, incubating a short-chain
dehydrogenase/reductase substrate with the sample and determining the amount of
product formation from the short-chain dehydrogenase/reductase substrate. When
TMPRSS2 activity is being measured, a serine protease substrate is incubated with the
sample and the amount of protein cleavage is determined. In either case, the enzyme
products can be measured using, for example, any of the detectable labels and detection
systems described previously.

When ARSDR1 activity is monitored, the amount of product formed or rate of
product formation can be measured either indirectly, by measuring the appearance of
reduced coenzyme or disappearance of non-reduced coenzyme, or directly, by
measuring the appearance of product or disappearance of substrate. The
amount of product formation can be measured indirectly by measuring the appearance of
reduced coenzyme, for example, NADH or NADPH, indicating that the substrate has
been oxidized in the ARSDR1-catalyzed reaction. Conversely, the amount of product
formed or rate of product formation can be measured indirectly by measuring the
disappearance of non-reduced coenzyme, for example, NAD+ and NADP+, indicating
that the coenzyme has been reduced in the ARSDR1-catalyzed reaction. In addition, the
appearance of product and disappearance of substrate can also be used to measure the
activity of ARSDR1. The magnitude of product formed will directly correlate with the
ARSDR1 activity in the sample and therefore, with the expression levels of ARSDR1 in
the sample.

Methods applicable for determining the activity of ARSDR1 in a sample include,
for example, determining the presence of short-chain dehydrogenase/reductase substrates
such as steroids or steroid derivatives containing hydroxyl groups and short-chain
dehydrogenase/reductase coenzymes such as pyridine nucleotides NAD+ and NADP+ or
derivatives thereof. Derivatives can further exhibit the capability of releasing a dye or
fluorochrome, for example, upon chemical modification by ARSDR1 such as the
oxidation of the substrate or reduction of the coenzyme. The difference in light
absorbance between the oxidized and reduced forms of coenzyme is routinely
distinguished by spectral measurements well known in the art. For example, NADH and
NADPH are characterized by maximal absorption at about 340 nm, while the non-reduced
forms, NAD+ and NADP+, absorb maximally at about 260 nm. Methods useful for the
detection of changes in polarity are useful for measuring the disappearance of substrate
and appearance of product and can include, for example, thin layer chromatography
(TLC), nuclear magnetic resonance spectroscopy (NMR) and infrared spectroscopy.
Short-chain dehydrogenase/reductase substrates, coenzymes and their respective
derivatives are well known in the art and are similarly applicable in the methods of the
invention for determining ARSDR1 activity in a sample.
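
As an illustration of the spectral measurement described above, the following minimal sketch converts a rate of absorbance increase at 340 nm into a rate of reduced coenzyme formation using the Beer-Lambert law. The molar extinction coefficient of 6220 M^-1 cm^-1 for NADH or NADPH at 340 nm is a commonly used literature value and is not taken from this specification; the function name, parameter names and example numbers are hypothetical.

```python
# Commonly used literature value for NADH/NADPH at 340 nm (M^-1 cm^-1).
# Assumption: this constant is not stated in the specification.
NADH_EXTINCTION_M_CM = 6220.0


def nadh_rate_umol_per_min(delta_a340_per_min, reaction_volume_ml,
                           path_length_cm=1.0):
    """Micromoles of reduced coenzyme formed per minute.

    Beer-Lambert law: A = epsilon * c * l, so the concentration change
    per minute is delta_A / (epsilon * l), in mol/L.
    """
    if reaction_volume_ml <= 0 or path_length_cm <= 0:
        raise ValueError("volume and path length must be positive")
    molar_per_min = delta_a340_per_min / (NADH_EXTINCTION_M_CM * path_length_cm)
    liters = reaction_volume_ml * 1e-3
    return molar_per_min * liters * 1e6  # mol -> micromol


# Illustrative values only: 0.0622 absorbance units per minute in 1 ml.
print(nadh_rate_umol_per_min(0.0622, reaction_volume_ml=1.0))
```

The same calculation with a negative absorbance change would describe consumption of reduced coenzyme; the sketch does not correct for background absorbance of the sample.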

Substrates applicable for determining the activity of TMPRSS2 in a sample
include, for example, serine protease substrates such as Lys and Arg containing
polypeptides and peptides. Specific examples of TMPRSS2 substrates include PSA, hK2,
semenogelin, hemoglobin, glucagon, and casein, all of which can be obtained from
commercial sources. Peptides of these polypeptides can additionally be used as
TMPRSS2 substrates so long as they contain a Lys or Arg residue. In addition, serine
protease substrate analogs also can be used for determining the amount of TMPRSS2
activity in a sample. Such analogs can further exhibit the capability of releasing a dye or
fluorochrome, for example, upon cleavage by TMPRSS2. A serine protease analog
capable of releasing dye is azo dye-impregnated collagen, which is also available
commercially. Other serine protease substrates and analogs are well known to those
skilled in the art and are similarly applicable in the methods of the invention for
determining TMPRSS2 activity in a sample.

The invention further provides a method of identifying a compound that inhibits
the activity of an inventive polypeptide. The method consists of contacting a sample
containing the inventive polypeptide and an appropriate substrate with a test compound
under conditions that allow product formation from the substrate, and measuring the
amount of product formed from the substrate. A decrease in the amount of product
formed from the inventive polypeptide substrate in the presence of the test compound
compared to the absence of the test compound indicates that the compound inhibits
the activity of the inventive polypeptide. Similarly, compounds that increase
the activity of an inventive polypeptide also can be identified. A test compound that,
when added to a sample containing an inventive polypeptide and an appropriate substrate,
increases the amount of product or the rate of product formation by chemical modification
of the substrate compared to the absence of the test compound is a compound that
increases the activity of the inventive polypeptide. Therefore, the invention provides a
method of identifying compounds that modulate the activity of the polypeptides of the
present invention. The polypeptide-containing sample used for such a method can be
serum, prostate tissue, a prostate cell population or a recombinant cell population
expressing the inventive polypeptide.

The methods for determining the activity of an inventive polypeptide in a sample
described above can also be adapted for screening test compounds to determine their
ability to inhibit or increase product formation catalyzed by an inventive polypeptide
from its substrates. In such cases, a test compound is added to a reaction system and the
effect of the test compound on production of product is observed. Those compounds
which inhibit the product formation or rate of product formation are considered as
potential antagonists of the inventive polypeptides and further as potential therapeutic
agents for treatment of neoplastic conditions of the prostate. Similarly, those compounds
which increase the product or rate of product formation are considered as potential
agonists of the inventive polypeptides and further as potential therapeutic agents for the
treatment of neoplastic conditions of the prostate.

A reaction system for identifying a compound that inhibits or enhances the
activity of the inventive polypeptides can be performed using essentially any source of
inventive polypeptide activity. Such sources include, for example, a prostate cell sample,
lysate or fractionated portion thereof; a bodily fluid such as blood, serum or urine from
an individual with a prostate neoplastic condition; a recombinant cell or soluble
recombinant source; and an in vitro translated source. The source of inventive
polypeptide is combined with an appropriate substrate as described above and incubated
in the presence or absence of a test inhibitory compound. The reaction rate or extent of
the usage of the substrate in the presence of the test compound is compared with that in
the absence of the test compound. Those test compounds which provide inhibition of the
reaction activity of at least about 50% are considered to be inhibitors of the inventive
polypeptides. Similarly, those compounds which increase the reaction activity two-fold
or more are considered to be enhancers of the activity of the inventive polypeptides.
Such inhibitors of the inventive polypeptides can then be subjected to further in vitro or
polypeptides in cellular and animal models.
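
The cutoffs stated above (inhibition of at least about 50%, or an increase of two-fold or more) can be sketched as a simple classification of a test compound from two measured reaction rates. This is a minimal sketch only; the function names, the rate units and the label strings are hypothetical and are not part of the described method.

```python
def percent_inhibition(rate_without, rate_with):
    """Percent reduction in reaction rate caused by the test compound."""
    if rate_without <= 0:
        raise ValueError("rate without compound must be positive")
    return (1.0 - rate_with / rate_without) * 100.0


def classify_compound(rate_without, rate_with,
                      inhibition_cutoff=50.0, enhancement_fold=2.0):
    """Label a test compound using the cutoffs given in the text.

    rate_without: reaction rate in the absence of the test compound
    rate_with:    reaction rate in the presence of the test compound
    """
    if percent_inhibition(rate_without, rate_with) >= inhibition_cutoff:
        return "inhibitor"
    if rate_with / rate_without >= enhancement_fold:
        return "enhancer"
    return "neither"
```

For example, a compound that lowers the rate from 10 to 4 arbitrary units gives about 60% inhibition and would be labeled an inhibitor, while one that raises it from 10 to 20 would be labeled an enhancer.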

Suitable test compounds for the inhibition or enhancement assays can be any
substance, molecule, compound, mixture of molecules or compounds, or any other
composition which is suspected of being capable of inhibiting inventive polypeptide
activity in vivo or in vitro. The test compounds can be heterocyclic organic compounds
such as steroids or steroid derivatives, or macromolecules, such as biological polymers,
including proteins, polysaccharides and nucleic acids. Sources of test compounds which
can be screened for inhibitory activity against the inventive polypeptides include, for
example, libraries of peptides, polypeptides, DNA, RNA and small organic compounds.
Additionally, test compounds can be preselected based on a variety of criteria. For
example, suitable test compounds for ARSDR1 can be selected as having known
short-chain dehydrogenase/reductase inhibition or enhancement activity. Suitable test
compounds for TMPRSS2 can be selected as having known serine protease inhibition or
enhancement activity. Specific examples of such serine protease inhibitory test
compounds include chymostatin, Aprotinin, Propionyl-leupeptin hemisulfate,
4-(2-Aminoethyl) benzenesulfonyl fluoride hydrochloride, and
N-(N-Tosyl-L-phenylalanyl)-2-aminoacridone. Alternatively, test compounds can be
selected randomly and tested by the screening methods of the present invention. Test
compounds are administered to the reaction system at a concentration in the range from
about 1 nM to 1 mM. Useful test compounds such as steroids and steroid derivatives are
lipophilic, thus allowing them to cross the cell membrane. In addition, routine ligand
specific targeting methods are useful for testing compounds for inhibitory activity.
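
One way to cover the stated concentration range of about 1 nM to 1 mM is a ten-fold dilution series. The sketch below is an assumption for illustration only: the text gives the range but not the spacing of test concentrations, and the function name is hypothetical. Concentrations are kept as integers in nM to avoid floating-point rounding.

```python
def tenfold_series_nM(low_nM=1, high_nM=1_000_000):
    """Return ten-fold spaced concentrations in nM from low_nM up to high_nM.

    The defaults span 1 nM to 1 mM (1,000,000 nM), the range given in the text.
    The ten-fold spacing is an illustrative assumption.
    """
    if low_nM <= 0 or high_nM < low_nM:
        raise ValueError("require 0 < low_nM <= high_nM")
    series = []
    concentration = low_nM
    while concentration <= high_nM:
        series.append(concentration)
        concentration *= 10
    return series
```

With the default arguments the series contains seven concentrations, one per decade.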

Therefore, the invention provides a method of identifying a compound that
inhibits or enhances the activity of an inventive polypeptide where the sample further
consists of a prostate cell lysate, a recombinant cell lysate expressing one of the inventive
polypeptides, an in vitro translation lysate containing mRNA encoding one of the
inventive polypeptides, a fractionated sample of a prostate cell lysate, a fractionated
sample of a recombinant cell lysate expressing one of the inventive polypeptides, a
fractionated sample of an in vitro translation lysate containing mRNA encoding one of
the inventive polypeptides or an isolated inventive polypeptide. The method can be in
single or multiple sample format.

In another embodiment, polypeptides and peptides of the invention can be used as
vaccines to prophylactically treat individuals for the occurrence of a prostate neoplastic
condition or pathology. Such vaccines can be used to induce B or T cell immune
responses or both aspects of the individual's endogenous immune mechanisms. The mode
of administration and formulations to induce either or both of these immune responses
are well known to those skilled in the art. For example, polypeptides and peptides of the
invention can be administered in many possible formulations, including pharmaceutically
acceptable mediums. They can be administered alone or, for example, in the case of a
peptide, the peptide can be conjugated to a carrier, such as KLH, in order to increase its
immunogenicity. The vaccine can include or be administered in conjunction with an
adjuvant, various of which are known to those skilled in the art. After initial
immunization with the vaccine, further boosters can be provided if desired. Therefore,
the vaccines are administered by conventional methods in dosages which are sufficient to
elicit an immunological response, which can be easily determined by those skilled in the
art. Alternatively, the vaccines can comprise anti-idiotypic antibodies which are internal
images of the inventive polypeptides and peptides described above. Methods of making,
selecting and administering such anti-idiotype vaccines are well known in the art. See,
for example, Eichmann, et al., CRC Critical Reviews in Immunology 7:193-227 (1987).

The invention additionally provides a method of treating or reducing the
progression of a prostate neoplastic condition. The method consists of administering to
an individual having a neoplastic condition of the prostate an inhibitory amount of an
inhibitor specific for a polypeptide of the invention, wherein said inhibitory amount
causes a reduction of at least about 2-fold in the amount or activity of the targeted
polypeptide. A specific example of an ARSDR1 specific inhibitor is a short-chain
dehydrogenase/reductase inhibitor or an ARSDR1 antisense nucleic acid. A specific
example of a TMPRSS2 inhibitor is a serine protease inhibitor or a TMPRSS2 antisense
nucleic acid. A specific example of an 8C3 specific inhibitor is an 8C3 antisense nucleic
acid. Similarly, a specific example of a PART-1 specific inhibitor is a PART-1 antisense
nucleic acid.

Such inhibitors may be produced using methods which are generally known in the
art, and include the use of purified inventive polypeptide to produce antibodies or to
screen libraries of compounds, as described previously, for those which specifically bind
to one of the inventive polypeptides. For example, known inhibitors of oxidoreductases
belonging to the short-chain dehydrogenase/reductase family that inhibit ARSDR1 can
be used. Lipophilic compounds able to cross the lipid bilayer that makes up cell
membranes are especially useful inhibitors for practicing the methods of the invention.

Antibodies specific to the polypeptides of the present invention can be used, for
example, directly as an antagonist, or indirectly as a targeting or delivery mechanism for
bringing a cytotoxic or cytostatic agent to neoplastic prostate cells. Such agents can be,
for example, radioisotopes. The antibodies can be generated using methods that are well
known in the art and include, for example, polyclonal, monoclonal, chimeric, humanized,
single chain, Fab fragments, and fragments produced by a Fab expression library.

In another embodiment of the invention, the polynucleotides encoding the
inventive polypeptides, or any fragment thereof, or antisense molecules, can be used for
therapeutic purposes. In one aspect, antisense molecules to the polynucleotides encoding
the polypeptides of the invention can be used to block the transcription or translation of
an mRNA homologous to the antisense molecule. Specifically, cells can be transformed
with sequences complementary to mRNA transcripts encoding the inventive
polypeptides. Such methods are well known in the art, and sense or antisense
oligonucleotides or larger polynucleotide fragments can be designed from various
locations along the coding or control regions of sequences encoding the inventive
polypeptides. Thus, antisense molecules may be used to modulate the activity of the
inventive polypeptides, or to achieve regulation of gene function.

Expression vectors derived from retroviruses, adenovirus, adeno-associated virus
(AAV), herpes or vaccinia viruses, or from various bacterial plasmids can be used for
delivery of antisense nucleotide sequences to the prostate cell population. The viral
vector selected should be able to infect the tumor cells, be safe to the host and cause
minimal cell transformation. Retroviral vectors and adenoviruses offer an efficient,
useful, and well characterized means of introducing and expressing foreign genes
in mammalian cells. These vectors are well known in the art, have very
broad host and cell type ranges, and express genes stably and efficiently. Methods which
are well known to those skilled in the art can be used to construct such recombinant
vectors and are described in Sambrook et al. (supra). Even in the absence of integration
into the DNA, such vectors can continue to transcribe RNA molecules for a substantial
period of time. Transient expression can last for a month or more with a non-replicating
vector and even longer if appropriate replication elements are part of the vector system.

Ribozymes, enzymatic RNA molecules, can also be used to catalyze the specific
cleavage of mRNAs encoding the polypeptides of the present invention. The mechanism
of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to
a complementary target RNA, followed by endonucleolytic cleavage. Specific ribozyme
cleavage sites within any potential RNA target are identified by scanning the target
RNA for ribozyme cleavage sites which include, for example, the following sequences:
GUA, GUU, and GUC. Once identified, short RNA sequences of between 10 and 20
ribonucleotides corresponding to the region of the target gene containing the cleavage
site can be evaluated for secondary structural features which can render the
oligonucleotide inoperable. The suitability of candidate targets can also be evaluated by
testing accessibility to hybridization with complementary oligonucleotides using
ribonuclease protection assays. Antisense molecules and ribozymes of the invention can
be prepared by any method known in the art for the synthesis of nucleic acid molecules.
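
The scanning step described above can be sketched as follows. This minimal example locates the GUA, GUU and GUC triplets in a target RNA and extracts a window of 10 to 20 ribonucleotides around each site for later structural evaluation. The function names and the choice to roughly center the window on the triplet are illustrative assumptions; secondary structure evaluation itself is not shown.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_sites(sequence):
    """Return (0-based index, triplet) for each GUA, GUU or GUC in the target.

    DNA-style input is accepted; T is converted to U before scanning.
    """
    rna = sequence.upper().replace("T", "U")
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            sites.append((i, triplet))
    return sites


def candidate_region(sequence, site_index, length=15):
    """Return a window of `length` nt containing the triplet at site_index.

    The window is roughly centered on the triplet and clipped to the
    sequence ends (centering is an illustrative assumption).
    """
    if not 10 <= length <= 20:
        raise ValueError("length must be between 10 and 20 ribonucleotides")
    rna = sequence.upper().replace("T", "U")
    center = site_index + 1
    start = max(0, center - length // 2)
    end = min(len(rna), start + length)
    start = max(0, end - length)
    return rna[start:end]
```

Each returned window could then be passed to a structure prediction tool of the user's choice to screen out candidates with unfavorable secondary structure.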

In another embodiment, the ARSDR1, TMPRSS2 and PART-1 promoter and
regulatory regions can be used for constructing vectors for prostate cancer gene therapy.
The promoter and regulatory region can be operably fused to a therapeutic
polynucleotide for prostate specific expression. This method can include the addition of
one or more enhancer elements which amplify expression of the heterologous therapeutic
polynucleotide without compromising tissue specificity.

Examples of therapeutic polynucleotides that are candidates for prostate gene
therapy utilizing the ARSDR1, TMPRSS2 and PART-1 promoters include suicide genes.
The expression of suicide genes produces a polypeptide or agent that directly or
indirectly inhibits neoplastic prostate cell growth or promotes neoplastic prostate cell
death. Suicide genes include genes encoding enzymes such as thymidine kinase,
oncogenes, tumor suppressor genes, genes encoding toxins, genes encoding cytokines, or
a gene encoding oncostatin. The therapeutic polynucleotides of the present invention can
be expressed using the vectors described previously for antisense expression as well as
others well known in the art.

It is understood that modifications which do not substantially affect the activity of
the various embodiments of this invention are also included within the definition of the
invention provided herein. Accordingly, the following examples are intended to illustrate
but not limit the present invention.

EXAMPLE 1

Identification of ARSDR1, an Androgen-Regulated Polynucleotide
This example shows identification of ARSDR1, TMPRSS2, PART-1 and 8C3 as
genes that are transcriptionally regulated by androgens in human prostate cancer cells.

To identify genes transcriptionally regulated by androgens, microarrays
containing prostate-derived cDNAs were screened using RNA from a prostate cell line.
Those RNAs showing increased expression levels in response to androgen stimulation
were identified and characterized further. Specifically, a non-redundant set of 1500
prostate-derived cDNA clones was identified from the Prostate Expression Database, a
public sequence repository of expressed sequence tag (EST) data derived from human
prostate cDNA libraries (Hawkins et al., Nucleic Acids Res. 27:204-208 (1999)).
These 1500 unique cDNAs were sequence verified and the clones were stored in 96-well
microtiter plates. The inserts of the cDNAs were amplified with primers BL_M13F
(5'-GTAAAACGACGGCCAGTGAATTG-3') (SEQ ID NO: 12) and BL_M13R
(5'-ACACAGGAAACAGCTATGACCATG-3') (SEQ ID NO: 13). Two µl of bacterial
culture were used as PCR templates. PCR was performed with an initial incubation
at 94°C for 5 minutes, followed by 35 cycles of 94°C for 30 seconds, 57°C for 30
seconds, 72°C for 5 minutes, and a final extension at 72°C for 7 minutes. PCR products
were purified with Sephacryl S-500 (Pharmacia, Kalamazoo, MI) on 96-well silent
screen filter plates (Nunc, Rochester, NY). The DNA concentration was 200-400 ng/µl.
The purified PCR products were mixed with an equal volume of DMSO and spotted
twice onto Type IV glass microscope slides (Amersham, Piscataway, NJ) using a
Molecular Dynamics (Sunnyvale, CA) GenII robotic spotting tool. After spotting, the
glass slides were air-dried, UV crosslinked with 500 mJ of energy and then baked
at 95°C for 30 minutes.

To identify genes transcriptionally regulated by androgens, the microarrays of
prostate-derived cDNAs were screened using total RNA isolated from LNCaP cells
cultured for 72 hours either in the presence or absence of a synthetic androgen, R1881
(NEN Life Science Products, Boston, MA).

Total RNA was prepared using TRIzol (Gibco-BRL, Germantown, MD)
according to the manufacturer's directions. The integrity of the RNA preparation was
checked on a standard formaldehyde agarose gel. Fifty µg of the total RNA were
digested with 1 µl of RQ1 RNase-free DNase (Promega, Madison, WI) (1 U/µl) in 1X
first strand cDNA synthesis buffer (Gibco-BRL, Germantown, MD) at 37°C for 30
minutes. The reaction mix was then extracted with phenol/chloroform (1:1) and RNA
was precipitated with ethanol. The mRNA was isolated from the DNA-free total RNA
using a Dynabeads mRNA purification kit (Dynal, Lake Success, NY). LNCaP cells
were cultured as follows. The culture medium for LNCaP cells was RPMI 1640 with 5%
FBS (Gibco-BRL, Germantown, MD). For androgen experiments, six flasks (175 cm2) of
LNCaP cells were starved for androgens by culturing in CS media (RPMI 1640 with 5%
charcoal-filtered FBS). After 72 hours of incubation, three flasks were incubated with
CS media and the other three were incubated with CS media plus 1 nM of synthetic
androgen R1881. All LNCaP cells were incubated for an additional 72 hours and then
harvested.

Fluorescence-labeled probes were constructed from the above-isolated mRNA as
follows. One µg of mRNA or 30 µg of total RNA was mixed with anchored oligo
dT primer (Amersham, Piscataway, NJ), incubated at 70°C for 10 minutes and then
chilled on ice. Four µl of 5X first strand cDNA synthesis buffer (Gibco-BRL,
Germantown, MD), 2 µl of 0.1 M DTT (Gibco-BRL, Germantown, MD), 1 µl of HPRI
(20 U/µl) (Amersham, Piscataway, NJ), 1 µl of dNTP mix (Amersham, Piscataway, NJ)
containing 2 mM dATP, 2 mM dGTP, 2 mM dTTP and 1 mM dCTP, 1 µl of Cy3 dCTP
(1 mM) (Amersham, Piscataway, NJ) and 1 µl of Superscript II RT (200 U/µl) were
added. The reactants were incubated at 42°C for 2 hours. Following first strand cDNA
labeling, the reaction mixture was incubated at 94°C for 3 minutes. Unlabeled RNA was
hydrolyzed by the addition of 1 µl of 5N NaOH and incubation at 37°C for 10 minutes.
One µl of 5M HCl and 5 µl of 1M Tris-HCl (pH 7.5) were added after the incubation to
neutralize the reaction mixture. The mixture was then purified with the Qiagen PCR
purification kit (Qiagen, Valencia, CA) following the manufacturer's protocol except
for washing twice with PE buffer. Following the purification, DNA was eluted with 30 µl
of distilled H2O.

Microarray hybridization was performed as follows. One µl of dA/dT (12-18)
(1 µg/µl) (Pharmacia, Kalamazoo, MI) and 1 µl of human Cot1 DNA (1 µg/µl)
(Gibco-BRL, Germantown, MD) were added to the probe. The reaction mixture was
then heat denatured at 94°C for 5 minutes. An equal volume of 2X Microarray
Hybridization Solution (Amersham, Piscataway, NJ) was added and the mixture was
prehybridized at 50°C for 1 hour. After prehybridization, the probe mixture was placed
onto a microarray slide with a coverslip. The hybridization was carried out in a humid
chamber at 52°C for 16 hours. After hybridization, the slides were washed once with 1X
SSC, 0.2% SDS at room temperature for 5 minutes on a shaker, then washed twice with
0.1X SSC, 0.2% SDS at room temperature for 10 minutes. After washing, the slides were
dipped into distilled water to remove traces of salt and SDS. Finally, the slides were
dried with compressed air.

Analysis of the microarray slides was performed to identify cDNAs that show
increased expression levels in response to androgen stimulation. Hybridized microarray
slides were scanned with an Array Scanner Generation II (Amersham, Piscataway, NJ).
Intensity data were integrated at a pixel resolution of 10 micrometers using
approximately 20 pixels per spot, and recorded at 16 bits. Local background
hybridization signals were subtracted prior to comparing spot intensities and determining
expression ratios. For each experiment, each cDNA was represented twice on each slide,
and the experiments were performed in duplicate, producing four data points per cDNA
clone per hybridization probe. Intensity ratios for each cDNA clone hybridized with
probes derived from androgen-stimulated LNCaP and androgen-starved LNCaP were
calculated (stimulated intensity/starved intensity). A gene expression level change was
treated as significantly different between the two conditions if all four replicate spots for
a given cDNA demonstrated a ratio greater than 2 or less than and the signal intensity
was greater than 2 standard deviations above the image background. It had been
determined previously that expression ratios less than 2-fold are not reproducible in this
system.
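
The replicate-spot criterion described above can be sketched in code. In this minimal example the inputs are background-subtracted intensities for the four replicate spots of one cDNA. Several points are assumptions made for illustration and are not stated in the text: the lower ratio cutoff is taken as the reciprocal of the upper cutoff because the value is missing from the source, the signal test is applied to the brighter spot of each replicate pair, and all function and parameter names are hypothetical.

```python
def replicate_ratios(stimulated, starved):
    """Return stimulated/starved intensity ratios for paired replicate spots."""
    if len(stimulated) != len(starved):
        raise ValueError("replicate lists must have the same length")
    if any(value <= 0 for value in starved):
        raise ValueError("starved intensities must be positive")
    return [a / b for a, b in zip(stimulated, starved)]


def is_androgen_regulated(stimulated, starved, background_sd,
                          upper=2.0, lower=None):
    """Apply the all-replicates ratio test and the signal-above-background test.

    lower defaults to 1/upper (an assumption; the source omits the value).
    """
    if lower is None:
        lower = 1.0 / upper
    ratios = replicate_ratios(stimulated, starved)
    consistent = (all(r > upper for r in ratios)
                  or all(r < lower for r in ratios))
    # Assumption: the brighter spot of each pair must exceed 2 SD of background.
    detectable = all(max(a, b) > 2.0 * background_sd
                     for a, b in zip(stimulated, starved))
    return consistent and detectable
```

Requiring all four replicates to pass, rather than their mean, means that a single discordant spot is enough to exclude a cDNA.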

Of a total of 1500 distinct cDNAs represented on the microarray, 10 cDNAs were
identified that upon androgen stimulation exhibited signal intensities at least 1.5 times
local background and exhibited ratios between androgen-stimulated and androgen-starved
cells that were consistently larger than 1.5. These included PSA and hK2, two genes
containing androgen response elements located in the 5'-flanking regions that have been
shown to confer androgen responsiveness by functional studies (Riegman et al.,
Molecular Endocrinology 5:1921-1930 (1991); Murtha et al., Biochemistry
32:6459-6464 (1993)).

Also among the identified cDNAs were four cDNAs that are the subject of the
present invention and are referred to as ARSDR1, TMPRSS2, PART-1 and 8C3.
Sequence analysis and BLAST searches of the sequence of cDNA 6A4 against the
GenBank databases identified 6A4 as encoding a short chain dehydrogenase/reductase.
On this basis the polynucleotide encoded by the 6A4 cDNA was named ARSDR1. The
sequence searches also demonstrated that ARSDR1 is a novel polynucleotide. A portion
of ARSDR1 matches EST AA657851 (IMAGE ID: 1207405), but shows no matches to
any known genes in the non-redundant subdivision of the GenBank databases.

cDNA 10D11 was found to be homologous to a serine protease termed
TMPRSS2. Full-length sequencing of the microarray cDNA confirmed the identity of
10D11 as TMPRSS2 and added additional 3' sequence information to the mRNA
sequence available in the public databases. The expression level of TMPRSS2 increased
six-fold in androgen-stimulated LNCaP cells relative to androgen-deprived cells as
assayed by microarray hybridization.

cDNA 14D7 was shown to be an unknown cDNA as confirmed by BLAST searches against
the non-redundant subdivision of the GenBank database. The polynucleotide identified
from the initial 14D7 cDNA clone was termed PART-1 for "Prostate Androgen-Regulated
Transcript-1".

cDNA 8C3 was also shown to be an unknown cDNA as confirmed by BLAST 
searches against the non-redundant subdivision of the GenBank database. 

EXAMPLE 2 

Confirmation of the Androgen-Regulated Expression of ARSDR1

To show up-regulation of ARSDR1 cDNA in response to androgen stimulation, an
RNA blot containing the same RNAs used for the microarray hybridization was
hybridized with ARSDR1 cDNA and control G3PDH cDNA. The RNA blots were made
by fractionating 10 µg total RNA on a 1.2% formaldehyde gel and blotting onto nylon
filters (Sambrook et al., Molecular Cloning, Cold Spring Harbor, NY: Cold Spring
Harbor Laboratory Press (1989)). The ARSDR1 and G3PDH probes were labeled with
[α-32P] dCTP (Amersham, Piscataway, NJ) using a rediprime II random primer labeling
system (Amersham, Piscataway, NJ) and purified with a Sephadex G50 Nick column
(Pharmacia, Kalamazoo, MI). RNA hybridization confirmed the microarray hybridization
results that ARSDR1 is up-regulated by androgens. Quantification by the ImageQuant
program (Molecular Dynamics, Sunnyvale, CA) revealed that ARSDR1 expression levels
in androgen-stimulated LNCaP cells are about 15 times higher than in androgen-starved
LNCaP cells.
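
A fold change of this kind can be sketched as a ratio of band intensities. The example below additionally normalizes each ARSDR1 signal to the G3PDH control signal from the same lane. That normalization is an assumption for illustration: the text names G3PDH as the control probe but does not describe how the ImageQuant values were combined. The function name and the intensity values in the usage note are hypothetical.

```python
def normalized_fold_change(target_stim, control_stim,
                           target_starved, control_starved):
    """Fold change of a target band, each lane normalized to its control band.

    target_*:  band intensity of the gene of interest (e.g. ARSDR1)
    control_*: band intensity of the loading control (e.g. G3PDH)
    """
    if min(control_stim, control_starved, target_starved) <= 0:
        raise ValueError("control and starved target intensities must be positive")
    stimulated = target_stim / control_stim
    starved = target_starved / control_starved
    return stimulated / starved
```

With hypothetical intensities of 3000 and 200 for the target and 1000 for the control in both lanes, the function returns a fold change of about 15.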

To investigate whether the clones obtained as described above represented full
length transcripts, 5' rapid amplification of cDNA ends (5'RACE) from human prostate
Marathon-Ready cDNA (Clontech, Palo Alto, CA) was performed using
primers 6A3_RC3 (5'-GGACAGCATTTTCCTGATTTTGGGGGC-3') (SEQ ID NO: 16)
and 6A4_RC4 (5'-CAGAAGGAGGAGCAACAGCGGGAAC-3') (SEQ ID NO: 17).
5' RACE was carried out according to Clontech's protocol. The RACE products were
subcloned into pCR2.1-TOPO vectors (Invitrogen, Carlsbad, CA) with the TOPO TA
cloning kit (Invitrogen, Carlsbad, CA) and sequenced.

EXAMPLE 3 

ARSDR1 is Predominantly Expressed in Prostate Tissue

This example shows the prostate predominant expression and androgen
regulation of ARSDR1.

The expression profile of ARSDR1 in normal human tissues was determined by
RNA analysis to determine whether ARSDR1 exhibits tissue specific expression. A
multiple tissue Northern (MTN) blot (Clontech, Palo Alto, CA) containing RNAs from 8
human tissues and an RNA master blot (Clontech, Palo Alto, CA) containing RNAs
from 50 human tissues were hybridized with ARSDR1 cDNA probe. The 50 human
tissues are: whole brain; amygdala; caudate nucleus; cerebellum; cerebral cortex; frontal
lobe; hippocampus; medulla oblongata; occipital lobe; putamen; substantia nigra;
temporal lobe; thalamus; accumbens; spinal cord; heart; aorta; skeletal muscle; colon;
bladder; uterus; prostate; stomach; testis; ovary; pancreas; pituitary gland; adrenal gland;
thyroid gland; salivary gland; mammary gland; kidney; liver; small intestine; spleen;
thymus; peripheral leukocyte; lymph node; bone marrow; appendix; lung; trachea;
placenta; fetal brain; fetal heart; fetal kidney; fetal liver; fetal spleen; fetal thymus; fetal
lung; yeast total RNA; yeast tRNA; E. coli rRNA; E. coli DNA; poly r(A); human Cot1
DNA; human DNA; human DNA; and several no RNA controls. The ARSDR1 cDNA
was used as a probe and was labeled with [α-32P] dCTP (Amersham, Piscataway, NJ)
using the rediprime II random primer labeling system (Amersham, Piscataway, NJ)
followed by purification with a Sephadex G50 Nick column (Pharmacia, Kalamazoo, MI).
RNA hybridization was carried out in ExpressHyb hybridization solution (Clontech,
Palo Alto, CA). RNA blots were exposed to a phosphor screen (Molecular Dynamics,
Sunnyvale, CA) and the images were scanned into a computer with a Phosphorimager.
Quantification was done using the ImageQuant program (Molecular Dynamics, Sunnyvale,
CA). Overall, not double-counting the six tissues that appeared in both the MTN blot
and the RNA master blot, ARSDR1 expression was analyzed in 52 distinct tissues.
Among the 52 tissues analyzed, ARSDR1 is most abundantly expressed in prostate
tissue. It is also slightly expressed in other tissues such as spleen, thymus, testis, ovary,
small intestine, colon, peripheral blood leukocyte, kidney, adrenal gland, and fetal
liver.

In addition, an RNA blot containing RNAs from cancer cell lines LNCaP, DU145
and PC3 was made and hybridized with ARSDR1 cDNA probe and G3PDH cDNA
control probe. DU145 and PC3 are androgen-unresponsive cell lines. The ARSDR1
cDNA probe was labeled with [α-32P] dCTP (Amersham, Piscataway, NJ) using a
rediprime II random primer labeling system (Amersham, Piscataway, NJ) and purified
with a Sephadex G50 Nick column (Pharmacia, Kalamazoo, MI). The RNA blot was
made by fractionating 10 µg total RNAs on a 1.2% formaldehyde gel and blotting
(Sambrook et al., Molecular Cloning, Cold Spring Harbor, NY: Cold Spring Harbor
Laboratory Press (1989)). Interestingly, ARSDR1 is expressed both in the
androgen-dependent (AD), AR-containing cell line LNCaP and in the androgen-independent,
AR-negative cell lines DU145 and PC3.

EXAMPLE 4 
Isolation of the ARSDR1 Full Length cDNA

This example shows the isolation and determination of the nucleotide
and deduced amino acid sequence of the ARSDR1 polynucleotide, which contains 2539
base pairs and encodes a 318 aa polypeptide.

To clone a full length ARSDR1 cDNA, 1.2 million phage plaques from a human
prostate 5'-stretch cDNA library (Clontech, Palo Alto, CA) were screened with ARSDR1
probe. The screening procedure utilized has been described by Sambrook et al.
(Molecular Cloning, Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press
(1989)). Five cDNA clones were isolated. The inserts of these cDNAs
were isolated, subcloned and sequenced. PCR primers 6A4N1
(5'-CCAAAGAGCTGGCTCAGAGAGG-3') (SEQ ID NO: 18) and 6A4N2
(5'-CTGGGTGAAGAGGATGTTGGC-3') (SEQ ID NO: 19) were designed from the 5'
terminus of the existing cDNA, and used to produce a PCR fragment for additional
library screening. Eleven additional cDNAs were isolated and sequenced. Furthermore,
seven IMAGE cDNA clones (IMAGE CloneID: 360400, 109237, 1130518, 1401718,
1337270, 1723130, 1703429) containing ARSDR1
(http://www-bio.llnl.gov/bbrp/image/image.html) were purchased and sequenced.

To investigate whether the clone obtained as described above contained the full
length transcript, 5' rapid amplification of cDNA ends (5'RACE) from human prostate
Marathon-Ready cDNA (Clontech, Palo Alto, CA) was performed using
primers 6A3_RC3 (5'-GGACAGCATTTTCCTGATTTTGGGGGC-3') (SEQ ID NO: 16)
and 6A4_RC4 (5'-CAGAAGGAGGAGCAACAGCGGGAAC-3') (SEQ ID NO: 17). 5'
RACE was carried out according to Clontech's protocol. The RACE products were
subcloned into pCR2.1-TOPO vectors (Invitrogen, Carlsbad, CA) with the TOPO TA
cloning kit (Invitrogen, Carlsbad, CA) and sequenced.

Analysis of all of the above clones revealed a 2539 base pair sequence for
ARSDR1, which corresponds to the size of the ARSDR1 transcript as determined by
RNA hybridization. ARSDR1 encodes a polypeptide of 318 amino acids (SEQ ID
NO: 2). The ARSDR1 start codon has a strong translation start context according to
similarity to the Kozak translation initiation consensus sequence (Kozak, Mamm.
Genome 7:563-574 (1996)). Two potential polyadenylation signals were identified at
nucleotide positions 2439 and 2481. IMAGE clone 1703429 has a poly-A stretch that
uses the AATAAA polyadenylation signal at 2419, while ARSDR1 uses the AATAAA
signal at 2481. PCR primers flanking the start and stop codons were designed and a
band of the expected size encompassing the coding region was amplified from human
prostate Marathon-Ready cDNA (Clontech, Palo Alto, CA).
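
Locating candidate polyadenylation signals of the kind mentioned above amounts to a motif search over the cDNA sequence. The following minimal sketch reports 1-based nucleotide positions of each AATAAA hexamer, including overlapping occurrences. The function name is hypothetical, and the short sequence in the usage note is invented for illustration and is not the ARSDR1 sequence.

```python
def find_polya_signals(sequence, signal="AATAAA"):
    """Return 1-based start positions of each occurrence of `signal`."""
    seq = sequence.upper()
    motif = signal.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = seq.find(motif, start + 1)
    return positions
```

For the invented sequence GGAATAAACCAATAAA the function returns positions 3 and 11.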

EXAMPLE 5 

ARSDR1 is a Novel Member of the Short-Chain Dehydrogenases/Reductases (SDR)
This example shows by homology searches that ARSDR1 is a novel
member of the family of short-chain dehydrogenases/reductases (SDR).

BLAST searches were performed and established sequence homology between
ARSDR1 and many oxidoreductases from bacteria and plant sources. Subsequently, the
polypeptide sequence of ARSDR1 was compared to sequences contained in the
BLOCKS database (http://www.blocks.fhcrc.org) (Henikoff and Henikoff, Nucleic Acids
Res. 19:6565-6572 (1991)). Blocks are multiply aligned ungapped segments
corresponding to the most highly conserved regions of proteins. The BLOCKS database
aids in the detection and verification of polypeptide sequence homology by comparing a
polypeptide or DNA sequence to a database of polypeptide blocks. The BLOCKS search
revealed that the ARSDR1 polypeptide has three blocks that match the short-chain
dehydrogenases/reductases (SDR) family protein signature BLOCK (BL00061) with a
significant combined E-value of 2.6e-06 (Jornvall et al., Biochemistry 34:6003-6013
(1995)). SDRs are a large family of NAD(H)- or NADP(H)-dependent oxidoreductases,
whose members include many enzymes involved in steroid metabolism such as
estradiol 17-beta-dehydrogenase (also called 17-beta-hydroxysteroid dehydrogenase)
(EC 1.1.1.62), human 15-hydroxyprostaglandin dehydrogenase (NAD+) (EC 1.1.1.141)
and 11-beta-hydroxysteroid dehydrogenase (EC 1.1.1.146) (11-DH) (Jornvall et al.,
supra (1995)). Multiple sequence alignments of ARSDR1 with different members of the
human hydroxysteroid dehydrogenases (HSD) and a prokaryotic 20-beta-hydroxysteroid
dehydrogenase termed Streptomyces 3α/20β-hydroxysteroid dehydrogenase were
performed. The alignment was done with the ClustalW algorithm (Thompson et al., Nucl.
Acids Res. 22:4673-4680 (1994)) from MacVector 6.0 (Oxford Molecular). The BLOSUM
series matrix, which measures differences between two proteins, was used with an open
gap penalty score of 10 and an extend gap penalty score of 0.05. The GenBank accession
numbers for the SDR family members used in the alignment are as follows:
20-beta_HSD_Strex, Streptomyces 3α/20β-hydroxysteroid dehydrogenase, P19992;
11-beta_HSD1_human, P28845; 11-beta_HSD2_human, U14631; 17-beta_HSD1_human,
P14061; 17-beta_HSD2_human, L11708; 17-beta_HSD3_human, P37058.

Only two polypeptide motifs were identified as being conserved in the SDR
family. The first is a common GlyXXXGlyXGly (SEQ ID NO:14) pattern where the
coenzyme NAD(H) or NADP(H) binds at the N terminal of the SDR family enzymes
(Jornvall et al., supra (1995)). The second motif, TyrXXXLys (SEQ ID NO:15), is
indicated to be involved in the catalytic activity of the enzyme (Ghosh et al.,
Structure 2:629-640 (1994)). The ARSDR1 polypeptide contains both of these motifs
represented as amino acids 44 to 50 and 198 to 202, respectively, in SEQ ID NO:2.
Sequence analyses reveal that proteins in the SDR family only exhibit amino acid
sequence identity of about 15-30%, likely due to their early divergence and remote origin
(Persson et al., Eur. J. Biochem. 200:537-543 (1991)). ARSDR1 shows about 25%
amino acid sequence identity with other members of the SDR family and was thus
determined to be a novel member of the SDR family. Because the polypeptide is
androgen regulated and most predominantly expressed in the prostate, it was named
androgen regulated short-chain dehydrogenase/reductase 1 (ARSDR1).
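
The two conserved SDR motifs can be located in a polypeptide with a simple pattern search. The following is a minimal sketch under stated assumptions: X is taken to mean any residue, the function name find_motifs and the toy sequence are hypothetical, and coordinates are 1-based and inclusive so that a seven-residue GlyXXXGlyXGly hit spans, for example, positions 44 to 50.

```python
import re

# Motifs from the text, written as regular expressions (X = any residue).
SDR_MOTIFS = {
    "coenzyme_binding": "G...G.G",  # GlyXXXGlyXGly
    "catalytic": "Y...K",           # TyrXXXLys
}


def find_motifs(protein, motifs=SDR_MOTIFS):
    """Return {motif name: [(start, end), ...]} with 1-based inclusive coordinates."""
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping matches are also reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(protein.upper())
        ]
    return hits


# Hypothetical toy polypeptide, not SEQ ID NO:2.
toy = "MAAGASSGIGKAVTLLSYAASKAAL"
print(find_motifs(toy))
```

Because both motifs are short and degenerate, hits of this kind are only suggestive on their own; in this example they are interpreted together with the BLOCKS and alignment results.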

Prosite pattern searches revealed that ARSDR1 contains two Asn-glycosylation
sites at amino acid positions 174 and 198
(http://www.isrec.isb-sib.ch/software/PSTSCAN_form.html). These two sites are also
conserved among SDR family proteins. In addition, two protein kinase C (PKC)
phosphorylation sites at amino acid positions 57 and 106, a casein kinase II
phosphorylation site at amino acid position 57 and 7 N-myristoylation sites were
identified in the ARSDR1 polypeptide.
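
A pattern scan of this kind can be sketched as follows. The regular expressions are a transcription of the commonly cited PROSITE patterns for N-glycosylation (N-{P}-[ST]-{P}), protein kinase C phosphorylation ([ST]-x-[RK]) and casein kinase II phosphorylation ([ST]-x(2)-[DE]); that transcription, the function name scan_sites and the toy sequence are assumptions made for illustration and are not taken from this example.

```python
import re

# Assumed regex equivalents of the PROSITE patterns named in the lead-in.
SITE_PATTERNS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "PKC phosphorylation": r"[ST].[RK]",
    "CK2 phosphorylation": r"[ST]..[DE]",
}


def scan_sites(protein, patterns=SITE_PATTERNS):
    """Return {site name: [1-based position of the first residue of each match]}."""
    results = {}
    for name, pattern in patterns.items():
        regex = re.compile(f"(?=({pattern}))")  # lookahead keeps overlapping hits
        results[name] = [m.start() + 1 for m in regex.finditer(protein.upper())]
    return results


# Hypothetical toy polypeptide, not SEQ ID NO:2.
toy_protein = "MKNGSAPTLRSAAEW"
print(scan_sites(toy_protein))
```

Short patterns such as these match frequently by chance, so predicted sites are best read as candidates rather than confirmed modifications.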

EXAMPLE 6
Genomic Organization of ARSDR1
This example shows the determination of the ARSDR1 promoter and regulatory
as well as coding regions.

To determine the genomic organization of the ARSDR1 polynucleotide, ARSDR1
cDNA sequences were aligned against genomic sequences originating from a 197 kb
chromosome 14 BAC clone R-1012A1 recently sequenced by the National Sequencing
Center-Genoscope in France and deposited to GenBank under accession number
AL049779. BAC clone R-1012A1 contains the whole genomic sequence of the
ARSDR1 cDNA. The ARSDR1 polynucleotide has 7 exons and 6 introns. The sizes of
exons, the sizes of introns, and the exon/intron junctional sequences are listed in Table 1.
All the intron/exon junctions conform to the 5'-gt...3'-ag consensus except intron 2, which
has a 5'-gc...3'-ag splicing signal (Breathnach and Chambon, Annu. Rev. Biochem. 50:349-
383 (1981)). The 5'-gc...3'-ag splicing signals have previously been identified in other
genes (Devireddy and Jones, J. Virol. 72:7294-7301 (1998)).
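
The check of intron boundaries against the 5'-gt...3'-ag consensus can be sketched in a few lines. This is an illustrative sketch only: the function name classify_intron is hypothetical, and the flanking sequences are the lowercase intronic bases transcribed from the legible rows of Table 1 for introns 2, 5 and 6 (for intron 2 only the first six donor bases are used).

```python
def classify_intron(donor_flank, acceptor_flank):
    """Classify an intron by its first two and last two bases.

    donor_flank    -- intronic sequence starting at the first base of the intron
    acceptor_flank -- intronic sequence ending at the last base of the intron
    """
    pair = (donor_flank[:2].lower(), acceptor_flank[-2:].lower())
    if pair == ("gt", "ag"):
        return "canonical gt...ag"
    if pair == ("gc", "ag"):
        return "variant gc...ag"
    return f"non-consensus {pair[0]}...{pair[1]}"


# Intron number -> (donor flank, acceptor flank), transcribed from Table 1.
introns = {
    2: ("gcaagt", "tttcatatgttggctgacag"),
    5: ("gtgggcctagaggaaatgaa", "ttcatgccacccccaaccag"),
    6: ("gtatgaatgttatctctttt", "cctttctctttaccttccag"),
}

for number, (donor, acceptor) in sorted(introns.items()):
    print(number, classify_intron(donor, acceptor))
```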

TABLE 1: Summary of the Genomic Structure of the ARSDR1/6A4
Polynucleotide

Exon  Acceptor                  Donor                     Exon       Intron
                                                          size (bp)  size (bp)

1                               CAGgtctglgcaatgtattgcca   >114       2577
2     ctctccttctctgtctgcagGAA   GAGgcaagtlcacctcctttcaa   120        340
3     tttcatatgttggctgacagGAG   GAGgtaagtgtagaactagagag   159        1193
4     atcgtcttgttccctgcagaGAA   TGGgtaagaaaictggccnatc    104        7iS
5     attctagtatttctcaacagGTC   AAGgtgggcctagaggaaatgaa   211        5007
6     ttcatgccacccccaaccagGCT   CAGgtatgaatgttatctctttt   191        6591
7     cctttctctttaccttccagTGA                             1621



To characterize the 5' regulatory elements of ARSDR1, 5' genomic sequences
were examined for potential transcriptional start sites using a neural network promoter
prediction program (http://www-hgc.lbl.gov/projects/promoter.html; Reese et al., Large
Scale Sequencing Specific Neural Networks for Promoter and Splice Site Recognition,
Biocomputing: Proceedings of the 1996 Pacific Symposium, ed. Lawrence Hunter and
Terri E. Klein, World Scientific Publishing Company, Singapore (1996)). A predicted
transcription start site 167 base pairs 5' of the ATG start codon was identified. A TATA
box (TATAAT) was found 30 base pairs 3' of the putative transcriptional initiation site.

To further characterize the 5' genomic region of ARSDR1 by identifying other
potential transcription factor binding sites, the Transcription Element Search Software
(TESS) program was utilized (http://www.cbil.upenn.edu/tess/index.html; Schug and
Overton, TESS: Transcription Element Search Software on the WWW, Technical Report
CBIL-TR-1997-1001-v0.0 of the Computational Biology and Informatics Laboratory,
School of Medicine, University of Pennsylvania (1997)). A strong promoter sequence
was identified with a score of 0.87. A score of ≥0.85 has a 0.1-0.4% false positive
prediction rate. In addition, a sequence which has 86.7% homology (13 nucleotides
out of 15) to the androgen response element (ARE) consensus sequence
(5'-GGA/TACAnnnTGTTCT-3') (SEQ ID NO:20) was identified (Roche et al., Mol.
Endocrinol. 6:2229-2235 (1992)). Moreover, two sequences which have 86.7% (13
nucleotides out of 15) homology to the consensus sequence of progesterone responsive
elements (PRE) were identified (Lieberman et al., Mol. Endocrinol. 7:515-527 (1993)).
Furthermore, an IL-6 RE-BP (interleukin-6 response element binding protein) site
TTCCCAGAA (SEQ ID NO:21) was identified 281 bps 5' of the putative transcription
initiation site (Hocke et al., Mol. Cell. Biol. 12:2282-2294 (1992)).
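
The figure of 86.7% follows from 13 matching positions out of 15 (13/15 = 0.867). A minimal sketch of such a comparison is given below. It assumes that the three n positions of the consensus accept any base and therefore count as matches, writes the A/T position with the IUPAC code W, and uses a hypothetical candidate sequence; none of these choices is stated in this example.

```python
# IUPAC codes needed for the ARE consensus: W = A or T, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "N": "ACGT"}

# 5'-GGA/TACAnnnTGTTCT-3' written with IUPAC codes (15 positions).
ARE_CONSENSUS = "GGWACANNNTGTTCT"


def consensus_identity(candidate, consensus=ARE_CONSENSUS):
    """Return (matching positions, percent identity) for an equal-length candidate."""
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have the same length")
    matches = sum(
        base in IUPAC[symbol]
        for base, symbol in zip(candidate.upper(), consensus)
    )
    return matches, 100.0 * matches / len(consensus)


# Hypothetical 15-mer differing from the consensus at two positions.
matches, percent = consensus_identity("GGAACAGCTTGTACA")
print(matches, round(percent, 1))
```

Sliding such a comparison along a promoter sequence and keeping windows above a chosen threshold is one simple way to list candidate response elements.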

EXAMPLE 7

Chromosomal Localization of ARSDR1
This example shows the chromosomal localization of ARSDR1 to human
chromosome 14q by polymerase chain reaction (PCR).

The medium-resolution Stanford G3 radiation hybrid panel was used to map the
chromosome localization of ARSDR1 using primers
6A4F (5'-GGGGCATTTCCTTACATTGTCCTTG-3') (SEQ ID NO:22) and
6A4R (5'-CACTCCAAACAAGTGATGGGAACAC-3') (SEQ ID NO:23). PCR
was performed for 35 cycles of 94°C for 30 seconds, 54°C for 30 seconds
and 72°C for 30 seconds. The reaction products
were separated on a 1.2% agarose gel and the resulting product pattern was analyzed
through the Stanford genome web server (www.shgc.stanford.edu) to determine the
probable chromosomal location. ARSDR1 was determined to be localized to
SHGC-2558 between two cytogenetically mapped markers, D14S63 at 14q23 and D14S258
at 14q24.3 (Genome Database: http://www.gdb.org/). Therefore, ARSDR1 is mapped
to 14q23-24.3. This determination is consistent with the fact that the recently sequenced
BAC clone R-1012A1 (GenBank accession number: AL049779) containing ARSDR1
comes from chromosome 14q.

EXAMPLE 8

Expression of ARSDR1 in Sections of Normal and Adenocarcinoma Prostate Specimens
This example shows that ARSDR1 is expressed in both normal prostate and
prostate carcinoma.

To confirm prostate-specific ARSDR1 expression, in situ hybridizations were
performed on sections of normal prostate using both a sense and an antisense RNA
probe specific for ARSDR1. A PCR product was generated from the 3' end of
ARSDR1 using primers 6A4insitu1
(5'-TCTTCATTCAGAAAAATTATCTTAG-3') (SEQ ID NO:24) and 6A4insitu2
(5'-GACAGTTCAATATAAATTAAGTAAAAC-3') (SEQ ID NO:25). The PCR
product was cloned into pCRII-TOPO (Invitrogen, Carlsbad, CA). The plasmid was then
linearized at either end with BamHI or EcoRV, and transcribed to generate sense and
anti-sense digoxigenin-labeled probes. Both dig-dUTP labeled sense and anti-sense
probes were constructed using a dig-RNA labeling kit according to the manufacturer's
instructions (Boehringer Mannheim, Indianapolis, IN). In situ hybridization was
performed on a Ventana Gen II automated instrument (Ventana Medical Systems,
Tucson, AZ). Formalin-fixed and paraffin-embedded prostate specimens were obtained
from a previously established surgical specimen tissue bank. The tissue sections (5 μm)
were mounted onto Proma plus slides (VWR Scientific, West Chester, PA), deparaffinized
in a 65°C oven for 2 hours followed by three 5-minute soaks in xylene, and rehydrated
through graded alcohol with a final rinse in 2X SSC. Prior to hybridization, the sections
were digested with proteinase 1 cocktail for 12 minutes at 37°C before applying 10 ng of
either sense or anti-sense probe in the hybridization buffer. The probe was denatured
at 65°C for 4 minutes and hybridized at 42°C for 6 hours. The tissue sections were then
rinsed with 2X, 1X and 0.1X SSC at 37°C. The hybridization probe was detected with
mouse anti-dig antibody and the signal was amplified by subsequent application of biotin
conjugated anti-mouse antibody and streptavidin-horseradish peroxidase. The in situ
signal was then visualized by DAB and counter-stained with hematoxylin.

ARSDR1 was expressed in both the luminal secretory cells and the basal cells of
the epithelia of normal prostate. Little to no hybridization was seen in stromal cells. No
background hybridization to normal prostate tissue was seen with the sense ARSDR1
probe.

In situ hybridizations with ARSDR1 sense and antisense probes were also
performed on sections of primary prostate adenocarcinoma obtained from radical
prostatectomy specimens. ARSDR1 was uniformly expressed in prostate
adenocarcinoma cells as revealed by hybridization with anti-sense probes. Hybridization
with ARSDR1 sense probes showed no background hybridization to the tumor cells.

EXAMPLE 9

Determination of Androgen-Regulated and Prostate-Localized Expression of TMPRSS2
This example confirms that expression of TMPRSS2 is androgen-regulated and
that TMPRSS2 is highly expressed in normal and neoplastic prostate epithelium relative
to other human tissues.

TMPRSS2 is a prostate-specific and androgen-regulated polynucleotide that
encodes a 492 amino acid serine protease. Androgen-regulated expression of TMPRSS2
was confirmed by Northern analysis using the same LNCaP RNA that was used to
construct the probes for microarray hybridization. The LNCaP RNA was isolated using
TRIzol (Life Technologies, Germantown, MD) according to the manufacturer's
directions. Ten μg of total RNA were fractionated on 1.2% agarose denaturing gels and
transferred to nylon membranes by capillary method (Sambrook et al., Molecular
Cloning, Cold Spring Harbor, New York, Cold Spring Harbor Laboratory Press (1989)).
Blots were hybridized with TMPRSS2 probes labeled with [alpha-32P]dCTP by random
priming using the Random Primers DNA labeling kit (Life Technologies, Germantown,
MD) according to the manufacturer's protocol. All DNA manipulations including
transformation, plasmid preparation, gel electrophoresis, and probe labeling were
performed according to standard procedures described by Sambrook et al. (Molecular
Cloning, Cold Spring Harbor, New York, Cold Spring Harbor Laboratory Press (1989)).
Filters were imaged and quantitated by using a phosphor-capture screen and ImageQuant
software (Molecular Dynamics, Sunnyvale, CA). Phosphorimage quantitation of the
Northern demonstrated a nine-fold induction of TMPRSS2 expression after 72 hours of
androgen exposure with synthetic androgen R1881.

TMPRSS2 expression was also studied in the prostate carcinoma cell lines
LNCaP, DU145, and PC3 as well as in androgen-dependent (PXe-AD) and
androgen-independent (PXe-AI) prostate cancer xenografts, and prostate stroma (PS).
The prostate carcinoma cell lines LNCaP, DU145, and PC3 were cultured in RPMI 1640
medium supplemented with 10% fetal calf serum (FCS) (Life Technologies,
Germantown, MD). Twenty-four hours before androgen regulation experiments, LNCaP
cells were transferred into RPMI 1640 media with 10% charcoal-stripped FCS (CS-FCS)
(Life Technologies, Germantown, MD). This media was replaced with fresh CS-FCS
media or CS-FCS supplemented with 1 nM of the synthetic androgen R1881 (NEN Life
Science Products Inc., Boston, MA). Cells were harvested for RNA isolation at 0-, 1-,
2-, 4-, 8-, 24-, 48-, and 72-hour time points. Northern analysis was performed with total
RNA isolated from cell lines, normal prostate tissue, and prostate cancer xenografts as
described in Example 2.

TMPRSS2 expression could be detected in the normal prostate tissue and the
steady-state LNCaP cells grown in FCS, but was not detectable after 24 hours of
androgen depletion. Northern blot analysis was performed using a TMPRSS2 probe with
RNA extracted from normal prostate (NP), LNCaP at steady state (SS), LNCaP after 24
hours of androgen deprivation (time=0), LNCaP at specified hours after androgen
exposure (1, 2, 4, 8, 24, and 48 hours), the PC3 (PC3) and DU145 (DU145) prostate
cancer cell lines, the androgen-dependent (PXe-AD) and androgen-independent (PXe-AI)
prostate cancer xenografts, and prostate stroma (PS). TMPRSS2 expression could be
detected after 2 hours of androgen supplementation and increased steadily through
the 48-hour time point. TMPRSS2 expression was not detectable in the
androgen-unresponsive PC-3 and DU-145 cell lines, or in a short-term culture of prostate
stroma consisting of fibroblasts and smooth muscle cells.

Normal secretory prostate epithelial cells and early-stage prostate carcinomas
depend on androgens for growth. The emergence of an androgen-independent (AI)
phenotype is a hallmark of advanced prostate cancer. In addition to AI proliferation,
these neoplastic cells are also capable of androgen-independent PSA expression.
Northern analysis was further utilized to examine the expression of TMPRSS2 in human
prostate cancers propagated in a xenograft system that recapitulates the
androgen-dependent (AD) and subsequent AI characteristics of human prostate cancer
growth (Bladou et al., Int. J. Cancer 67:785-790 (1996)). TMPRSS2 was expressed in
both the AD and AI tumors, a finding that parallels PSA expression in this system,
indicating a possible dysregulation of TMPRSS2 control.

The distribution of TMPRSS2 transcripts in normal human tissues was also
determined by Northern analysis performed as described in Example 2. Northern blot
analysis of TMPRSS2 expression was performed using RNA from 16 human tissues.
The human multiple tissue blots were obtained from Clontech (Palo Alto, CA) and
contained 2 μg of poly(A)+ RNA in each lane. A beta-actin control probe was used to
verify equivalent loading of RNA. Of 16 adult tissues examined, TMPRSS2 message
was predominantly expressed in prostate tissues, with very low expression levels in
colon, lung, liver, kidney, and pancreas, and no detectable expression in spleen, thymus,
testes, ovary, peripheral leukocytes, heart, brain, placenta, or skeletal muscle.

EXAMPLE 10

TMPRSS2 Expression in Prostate Basal Cells and Prostate Carcinoma
This example shows that TMPRSS2 is expressed in prostate basal cells and
prostate carcinoma.

Normal prostate contains two major epithelial cell populations, the luminal
secretory cells and the basal cells. To localize TMPRSS2 expression, in situ
hybridizations were performed on sections of normal prostate by using an antisense RNA
probe specific for TMPRSS2. For mRNA in situ hybridization, recombinant plasmid
pCRII-TOPO (Invitrogen, Carlsbad, CA) containing a 489 bp TMPRSS2 fragment
(nt 513-1002 of the published TMPRSS2 sequence (Paoloni-Giacobino et al.,
Genomics 44:309-20 (1997))) was linearized by restriction digest of the vector to
generate sense and antisense digoxigenin-labeled RNA probes. In situ hybridization was
performed according to the manufacturer's protocol on a Ventana Gen II automated
instrument (Ventana Medical Systems, Tucson, AZ). Programmed recipe files consisting
of buffer rinses, protease digestion, hybridization, detection and counter-stains were
optimized for the TMPRSS2 probe. Briefly, the optimized conditions were as follows:
Digoxigenin-labeled RNA probe was added manually. Anti-digoxigenin (Ventana
Medical Systems, Tucson, AZ) was used as the primary antibody. Denaturation was
at 65°C and the hybridization was done at 42°C for 280 minutes. Washes were
performed at 35°C with 1x, 0.5x, and 0.1x saline sodium citrate (SSC). The system uses
a cocktail of anti-rabbit and anti-mouse secondary IgG-biotinylated antibody with an
indirect biotin avidin diaminobenzidine (DAB) detection system. The sections were
counter-stained with haematoxylin.

The results of the above study showed that TMPRSS2 was expressed exclusively
in the normal basal cell population. In situ hybridization with an antisense RNA probe
for TMPRSS2 was done to assay TMPRSS2 expression in normal and malignant prostate
tissue. TMPRSS2 expression was observed in basal cells of normal prostate tissue, but
not in secretory luminal epithelium. The in situ images were digitally acquired and the
staining intensity was enhanced to show contrast. Little to no staining was seen in
stroma, secretory cells, or infiltrating lymphocytes. In situ hybridization experiments
with sense strand control TMPRSS2 probe showed no background staining in normal
prostate tissue. In situ hybridizations with TMPRSS2 antisense and sense probes were
also performed on sections of primary prostate adenocarcinoma obtained from radical
prostatectomy specimens. Adenocarcinoma cells were uniformly positive for TMPRSS2
expression. In addition, TMPRSS2 expression was observed in primary prostate
carcinoma cells. The sense strand control TMPRSS2 probe exhibited no background
staining in cancerous prostate tissue.

EXAMPLE 11
Sequence Analysis of the Putative TMPRSS2 Promoter
This example shows that the TMPRSS2 polynucleotide contains an androgen
response element (ARE) in the 5' promoter region at nucleotides 576 to 590 of SEQ ID
NO:9.

To identify androgen regulatory sites, the DNA sequences upstream of the
TMPRSS2 coding region were cloned by genome walking. An 1100 base pair
DNA fragment overlapping the TMPRSS2 cDNA by 100 nucleotides that contained 870
base pairs of sequence 5' to the putative transcriptional start site was obtained using the
GenomeWalker kit by Clontech (Palo Alto, CA). Libraries of adapter-ligated genomic
DNA fragments were used as template for PCR reactions with the TMPRSS2
gene-specific primer U75329-71R 5'-TGAGTTCAAAGCCATCTTGCTGTTATCAAC-3'
(SEQ ID NO:26) and a primer corresponding to the library adapter sequence
AP1 5'-GTAATACGACTCACTATAGGGC-3' (SEQ ID NO:27) according to the
manufacturer's instructions. A nested PCR reaction with TMPRSS2 primer U75329-55R
5'-CCATCCTAATACGACTGACXATAGGGC-3' (SEQ ID NO:28) and adapter primer
AP2 5'-ACTATAGGGCACGCGTGGT-3' (SEQ ID NO:29) was performed. PCR
products were cloned into the pCR2.1 vector (Invitrogen, Carlsbad, CA) and sequenced
using M13 forward and M13 reverse primers. Nucleotide sequences were submitted for
homology comparisons against the nonredundant public sequence databases using the
BLAST server at the NCBI (http://www.ncbi.nlm.nih.gov/). The BLAST search
parameter prompts utilized are the default prompts located at the NCBI BLAST website.
Sequences examined for promoter and potential transcriptional start sites using a neural
network promoter prediction program (http://www-hgc.lbl.gov/projects/promoter.html)
identified a 51 base pair sequence beginning 250 nucleotides 5' of the putative
translational start site that correlates highly (score of 0.97 indicating a 0.1%
false-positive prediction rate) with consensus promoter elements. Sequences examined
for transcription factor binding sites using SIGSCAN
(http://bimas.dcrt.nih.gov/molbio/signal/) identified numerous putative
transcription-factor binding sites including consensus sites for SP1, Z-box, AP1, and AP2
regulation. A 15-bp sequence with significant homology to the consensus androgen
response element (ARE) is located at nucleotides 576 to 590 of SEQ ID NO:9.

EXAMPLE 12

Preparation of TMPRSS2-Specific Antibody and Analysis of TMPRSS2 Polypeptide
Expression

Polyclonal Antibody

TMPRSS2 peptide sequences were selected by direct primary structure
comparison between the members of the serine protease gene family and computer-aided
antigenicity, surface probability and hydrophobicity analyses. The TMPRSS2 peptides
for antibody production were selected based on the following criteria: 1) the peptide
sequence should be on the protein surface, preferably in flexible loops; 2) the
peptide sequence is at least 15 residues long; and 3) the number of cysteine and proline
residues in the selected peptide sequence should be kept to a minimum. In order to search
for suitable immunogenic peptides in TMPRSS2, the three-dimensional structure of
trypsin was evaluated and its loop regions (which are also on the protein surface) were
identified. The primary sequence of trypsin was aligned with that of TMPRSS2, and the
corresponding loop regions in TMPRSS2 were deduced. Table 2 sets forth the TMPRSS2
peptide sequences that were selected using these criteria.
TABLE 2: TMPRSS2 Peptides for Immunization

Peptide        Sequence              Residues   Rabbit Sera

SEQ ID NO:30   KVISHPNYDSKTKNNDIC    330-346    6623, 6624
SEQ ID NO:31   KLQKPLTFNDEVKPVC      350-365    6621, 6622
SEQ ID NO:32   CWISGWGATEEKGKTSEV    378-396    6619, 6620

The peptides shown in Table 2 were synthesized and then conjugated with
keyhole limpet hemocyanin (KLH) for immunizing rabbits. Conjugated peptides and
whole proteins were used for the production of rabbit polyclonal antibodies. These
procedures were contracted to the biotechnology company Research Genetics, Inc.
(Huntsville, AL). The rabbit anti-TMPRSS2 sera designated in Table 2 were obtained
one week following the second boost with each referenced TMPRSS2 peptide antigen.
Western blot analysis on lysates from LNCaP cells starved or stimulated with androgens
was performed with anti-TMPRSS2 antibody. An induction of TMPRSS2 polypeptide
was observed using the anti-TMPRSS2 antibody upon androgen administration. No
TMPRSS2 polypeptide was detected in DU145 or PC3 cells, which are non-responsive to
androgen.

Immunohistochemical analysis of TMPRSS2 polypeptide expression using
polyclonal antibodies raised against the protease domain of the TMPRSS2 polypeptide
was performed with normal prostate and prostate carcinoma tissue sections. Normal
prostate tissue showed immuno-staining of both basal and luminal epithelial cells. No
stromal cell staining was apparent. Prostate carcinoma tissue exhibited variable staining
intensity in individual neoplastic cells using TMPRSS2 polyclonal antibody. No
reactivity was observed with control non-immune IgG.
Monoclonal Antibody

TMPRSS2 is expressed in mammalian cells in order to produce soluble proteins
with suitable post-translational modifications that closely resemble the form of the
protein in physiologic sources. The TMPRSS2 full length cDNA sequence shown as
SEQ ID NO:3 is cloned into the plasmid pGT-d (Berg et al., Biotechniques 14:972-978
(1993)) and transfected into the AV12 hamster cell line (ATCC CRL 9595) as described
previously for the expression of recombinant hK2 protein (Charlesworth et al., Urology
49:487-493 (1997)). Alternatively, the TMPRSS2 cDNA is cloned into the pLNSX and
pLNCX retroviral expression vectors (Miller et al., Biotechniques 7:980-2, 984-6,
989-90 (1989)). Stable transfectants are isolated under drug resistance. Individual clones are
isolated, expanded, and checked for protein expression by Western blot using the
polyclonal antibodies described in Example 12. TMPRSS2 peptides and polypeptide are
used to generate monoclonal antibodies by contracting Immgenics Pharmaceuticals, Inc.
(Vancouver, British Columbia). Briefly, six-week-old A/J mice (Jackson Laboratories)
are immunized with two intraperitoneal injections of selected immunogen, titers are
checked and the mice are boosted with intravenous administration of the TMPRSS2
polypeptide. Hybridomas are produced by fusion of mouse splenocytes with P3.653
myeloma cells (Kohler et al., Nature 256:495-497 (1975)). Monoclonal antibodies are
selected based upon reactivity with TMPRSS2 as well as the failure to react with PSA
and hK2 serine proteases using ELISA. Hybridomas are then expanded and antibodies
are produced in vitro by mass culture or hollow fibers.
ELISA Analysis

An ELISA method for the quantitative screening of patient sera for TMPRSS2
polypeptide is developed using a sandwich ELISA assay as previously described for
prostate-specific antigen (PSA) (Corey et al., Int. J. Cancer 71:1019-1028 (1997)). Briefly,
all combinations of monoclonal or polyclonal antibodies to TMPRSS2 are tested in
sandwich assays to determine the pair with the highest sensitivity and specificity.
Female sera spiked with different concentrations of recombinant TMPRSS2 polypeptide
are used as a control, and to construct a standard curve.
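
Reading an unknown sample off a standard curve can be sketched as follows. This is a simplified illustration under stated assumptions, not the assay procedure of this example: it uses piecewise-linear interpolation between standards, whereas sandwich ELISA data are often fitted with a sigmoidal model, and the function name, concentrations (hypothetically in ng/mL) and absorbance values are invented for the sketch.

```python
def interpolate_concentration(absorbance, standards):
    """Estimate concentration from absorbance by piecewise-linear interpolation.

    standards -- list of (concentration, absorbance) pairs from spiked controls
    """
    points = sorted(standards, key=lambda pair: pair[1])  # order by absorbance
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo <= absorbance <= a_hi:
            if a_hi == a_lo:                  # flat segment: avoid dividing by zero
                return c_lo
            fraction = (absorbance - a_lo) / (a_hi - a_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("absorbance outside the range of the standard curve")


# Hypothetical standards: (concentration, absorbance).
standards = [(0.0, 0.05), (1.0, 0.25), (5.0, 1.05)]
print(interpolate_concentration(0.65, standards))
```

Samples reading outside the range of the standards raise an error rather than being extrapolated, which mirrors the usual practice of diluting and re-assaying such samples.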

EXAMPLE 13
RNA Blot Analysis of PART-1 Expression
This example corroborates by RNA blot analysis the microarray hybridization
results demonstrating androgen-induced up-regulation of PART-1.

An RNA or Northern blot containing the same RNAs used for the microarray
hybridization was hybridized to PART-1 cDNA. The RNA blots were made by
fractionating 10 μg total RNA on a 1.2% formaldehyde gel and blotting (Sambrook et al.,
Molecular Cloning, Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press
(1989)). The PART-1 cDNA probe was labeled with [α-32P]dCTP (Amersham,
Piscataway, NJ) using a rediprime II random primer labeling system (Amersham,
Piscataway, NJ) and the probes were purified with a Sephadex G50 Nick column
(Pharmacia, Kalamazoo, MI). Northern hybridization confirmed the microarray
hybridization results that PART-1 is up-regulated by androgens. The same blot was also
hybridized to PSA and G3PDH. PSA was shown to be strongly stimulated by androgens,
consistent with previous observation (Montgomery et al., Prostate 21:63-73 (1992)).
The amount of RNA loaded on each lane of the Northern blot was similar according to
G3PDH hybridization. Quantification utilizing the ImageQuant program (Molecular
Dynamics, Sunnyvale, CA) revealed that PSA and PART-1 expression levels in androgen
stimulated versus androgen starved LNCaP cells are 25.4 and 3.5 times higher,
respectively.
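
A fold induction of this kind is a ratio of band intensities, optionally corrected for loading with a control probe such as G3PDH. The sketch below shows one way to compute it; the function name and all intensity values are hypothetical, and normalizing to the loading control is an assumption about how such a calculation might be done rather than a description of the ImageQuant analysis used here.

```python
def fold_induction(target_stim, control_stim, target_starved, control_starved):
    """Fold change of a target transcript, normalized to a loading control.

    Each argument is a background-subtracted band intensity. The target signal
    in each lane is divided by the loading-control signal of the same lane
    before the stimulated/starved ratio is taken.
    """
    if min(control_stim, control_starved, target_starved) <= 0:
        raise ValueError("intensities used as denominators must be positive")
    normalized_stim = target_stim / control_stim
    normalized_starved = target_starved / control_starved
    return normalized_stim / normalized_starved


# Hypothetical intensities chosen to give a 3.5-fold induction.
print(fold_induction(3500.0, 1000.0, 1000.0, 1000.0))
```

With equal loading the control terms cancel and the result reduces to the simple ratio of stimulated to starved signal.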

The distribution of PART-1 transcripts in normal human tissues was also
determined by Northern blot analysis. An RNA Master Blot was purchased from
Clontech (Palo Alto, CA) and Northern hybridization was carried out in ExpressHyb
hybridization solution (Clontech). The Northern blot was exposed to a phosphor screen
(Molecular Dynamics) and the images were scanned into a computer with a
Phosphorimager. Quantification was done using the ImageQuant program (Molecular
Dynamics). Hybridization of PART-1 cDNA probes to a Clontech RNA Master Blot
revealed that PART-1 is expressed most abundantly in prostate with little or no
expression detected in colon, lung, liver, kidney, pancreas, spleen, thymus, testes, ovary,
peripheral leukocytes, heart, brain, placenta, and skeletal muscle.

EXAMPLE 14

Isolation of the Full Length cDNA for PART-1
This example shows cloning of the full length cDNA for PART-1 and
determination of its nucleotide sequence.

To clone PART-1, two rounds of 5' rapid amplification of cDNA ends (5' RACE)
from human Marathon-Ready prostate cDNAs (Clontech) and from androgen stimulated
LNCaP cDNAs were performed using the Marathon cDNA amplification kit (Clontech)
according to the manufacturer's protocol. The first round of 5' RACE was performed
with primers 14D7-196L
(5'-GTGACGGTCTTGGACAGTAAGGG-3') (SEQ ID NO:33) and 14D7-85L
(5'-AGAGTATTGTTGGCTTTGTCTGTC-3') (SEQ ID NO:34). The second round of 5'
RACE was performed with primers 14D7RC3
(5'-CTTTCCCCTCCGACAAGGAAGCTG-3') (SEQ ID NO:35) and 14D7RC4
(5'-CTCATCTGTGTTGTTCCAGTGCAGCC-3') (SEQ ID NO:36). The products
were then subcloned into pCR2.1-TOPO vectors with the TOPO TA cloning kit
(Invitrogen) and sequenced. In the second round of RACE using primers 14D7RC3 and
14D7RC4, a 300 bp band was obtained from both human Marathon-Ready prostate
cDNAs (Clontech) and androgen stimulated LNCaP cDNAs made by the Marathon cDNA
amplification kit (Clontech). Sequence analyses of 8 individual RACE clones from both
cDNA sources revealed that they all have the same 5' end base, indicating that it is the
end of the PART-1 cDNA. Overall, a total of 2109 bp were obtained. This result
corresponds to a 2.1 kilobase band that was observed on a Northern blot.

PART-1 cDNA encodes a 60 amino acid polypeptide (SEQ ID NO:6). The
translational start site conforms to the Kozak consensus motif for a translational start
site in an adequate context (Kozak, Mammalian Genome 7:563-574 (1996)). The PART-1
polypeptide has no homology to any known proteins in the database by BLAST and
FASTA searches. BLOCKS searches (http://www.blocks.fhcrc.org) (Henikoff et al.,
Nucleic Acids Research 27:204-208 (1999)) revealed that the PART-1 polypeptide has an
XPG_1 BLOCK. The XPG_1 BLOCK is found in the DNA-damage inducible gene Din7
from yeast (Mieczkowski et al., Molecular and General Genetics 255:655-665 (1997))
and in the XPG DNA repair endonuclease (O'Donovan et al., Journal of Biological
Chemistry 269:15965-15968 (1994)) as well as in exonuclease 1 of yeast (Fiorentini et
al., Molecular Cell Biology 17:2764-2773 (1997)). In addition, the PART-1 polypeptide
has two protein kinase C phosphorylation sites and one tyrosine kinase site. A
polyadenylation signal AAUAAA (Fitzgerald and Shenk, Cell 24:251-260 (1981)) was
identified at 633 and 1558 nucleotides 3' of the TAG stop codon. Also, a common
natural variant of the polyadenylation signal, AUUAAA (Wilusz et al., Nucleic Acids
Research 17:3899-3908 (1989)), was identified at 644 and 2054 nucleotides 3' of the
TAG stop codon (SEQ ID NO:5).
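
The search for polyadenylation signals 3' of the stop codon described above can be
sketched as follows. This is a minimal illustration only, not the procedure actually
used: the function name, the toy transcript and the distance convention (the base
immediately after the stop codon is counted as distance 1) are assumptions made for
the example.

```python
import re

# DNA forms of the AAUAAA signal and its AUUAAA variant named in the text.
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(cdna: str, stop_codon_end: int) -> list[tuple[str, int]]:
    """Return (signal, distance) pairs for hexamers found 3' of the stop codon.

    stop_codon_end is the 0-based index of the first base after the stop
    codon. The distance convention (first base after the stop codon = 1)
    is an assumption of this sketch.
    """
    downstream = cdna.upper().replace("U", "T")[stop_codon_end:]
    hits = []
    for signal in POLYA_SIGNALS:
        # A lookahead pattern reports overlapping occurrences as well.
        for match in re.finditer(f"(?={signal})", downstream):
            hits.append((signal, match.start() + 1))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy transcript: ATG AAA TAG followed by a short 3' region.
toy_cdna = "ATGAAATAG" + "CCC" + "AATAAA" + "GG" + "ATTAAA"
print(find_polya_signals(toy_cdna, stop_codon_end=9))
```

Applied to a full-length cDNA, the same scan would list every candidate signal,
which could then be compared with the observed transcript length.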

EXAMPLE 15 

Isolation of the PART-1 Promoter Region by Genomic Walking
This example shows cloning and sequence analysis of the PART-1 promoter
region.

The Human GenomeWalker kit (Clontech) was used to clone the promoter region
of the PART-1 cDNA with primers 14D7RC3 and AP1
(5'-GTAATACGACTCACTATAGGGC-3') (SEQ ID NO:27) (Clontech). Each GenomeWalker
kit contains five premade "libraries" constructed by digesting human genomic
DNA with 5 enzymes, EcoR V, Sca I, Dra I, Pvu II and Ssp I, and ligating the restriction
fragments to specific adaptors. PCR was performed with an initial incubation at 94°C for
3 minutes, followed by 5 cycles of 94°C for 25 seconds and 72°C for 4 minutes,
followed by 22 cycles of 94°C for 25 seconds and 67°C for 4 minutes, and a final
extension at 67°C for 7 minutes. This genomic walk produced a 1.3 kilobase (kb), a
2.3 kb and a 0.8 kb band from the Dra I, Pvu II and Ssp I human GenomeWalker
libraries, respectively. The 2.3 kb band obtained from the Pvu II library
was cloned into a PCR2.1-TOPO vector (Invitrogen) and 2325 base pairs of sequence
were obtained. The sequences were examined to identify a potential transcriptional start
site using a neural network promoter prediction program
(http://www.hgc.lbl.gov/projects/promoter.html)
(Reese et al., Large Scale Sequencing Specific Neural Networks for Promoter and
Splice Site Recognition, Biocomputing: Proceedings of the 1996 Pacific Symposium, ed.
Lawrence Hunter and Terri E. Klein, World Scientific Publishing Company, Singapore
(1996)), and for transcriptional factors using the TESS (Transcription Element Search
Software) program (http://www.cbil.upenn.edu/tess/index.html) (Schug and Overton,
TESS: Transcription Element Search Software on the WWW, Technical Report
CBIL-TR-1997-1001-v0.0 of the Computational Biology and Informatics Laboratory,
School of Medicine, University of Pennsylvania (1997)). A Dra I and an Ssp I site were
found in the sequences corresponding to the respective 1.3 and 0.8 kb genomic walking
PCR bands from the Dra I and Ssp I libraries.
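
Locating recognition sites of the five library enzymes in a walked sequence, as
in the Dra I and Ssp I observation above, can be sketched as follows. The
recognition sequences are the standard ones for these blunt-cutting enzymes; the
function name and the toy sequence are hypothetical, and the sketch reports
1-based site start positions rather than cut positions.

```python
# Standard recognition sequences of the enzymes used for the libraries.
GENOMEWALKER_ENZYMES = {
    "EcoR V": "GATATC",
    "Sca I": "AGTACT",
    "Dra I": "TTTAAA",
    "Pvu II": "CAGCTG",
    "Ssp I": "AATATT",
}


def find_sites(sequence: str, enzymes: dict[str, str] = GENOMEWALKER_ENZYMES) -> dict[str, list[int]]:
    """Map each enzyme name to the 1-based start positions of its sites."""
    seq = sequence.upper()
    result = {}
    for name, site in enzymes.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + 1)
            # Resume one base later so overlapping sites are not missed.
            start = seq.find(site, start + 1)
        result[name] = positions
    return result


# Hypothetical toy fragment containing one Dra I and one Ssp I site.
toy_fragment = "GGTTTAAACCAATATTGG"
for enzyme, positions in find_sites(toy_fragment).items():
    print(enzyme, positions)
```

All five recognition sequences are palindromic, so scanning one strand is
sufficient in this sketch.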

The PART-1 genomic walking sequence extends 2024 bp 5' of the start of the
PART-1 cDNA. A TATA box (TATAAAA) was identified at nucleotides 1484 to 1491
of SEQ ID NO:11. A putative transcriptional start site (TGTCTTCAAT) is predicted
at 30 nucleotides 5' of the TATA box. In addition, a binding site for the homeodomain-
containing protein Pbx-1a (Van Dijk et al., Proc. Natl. Acad. Sci. (1993)) was identified
at nucleotides 536 to 544 of SEQ ID NO:11. The PART-1 promoter region also contains
a binding site for NFAT-1 (nuclear factor of activated T cells) at nucleotides 926 to 935
of SEQ ID NO:9 (Rao, Immunol. Today 15:274-281 (1994)).

Nine putative polymorphisms for the PART-1 polynucleotide were identified
(Table 3). These polymorphisms were sequence verified and either form is represented
in at least two clones.

TABLE 3. Summary of the Polymorphisms Found in PART-1

Nucleotide Position   SEQ ID NO:      Region     Polymorphism (Base Change)

2230                  SEQ ID NO:11    Promoter   T -> C
1835                  SEQ ID NO:11    Promoter   C -> T
1807                  SEQ ID NO:11    Promoter   A -> G
1499                  SEQ ID NO:11    Promoter   G -> C
2088                  SEQ ID NO:11    Promoter   T -> C
223                   SEQ ID NO:5     cDNA       T -> C
5S9                   SEQ ID NO:5     cDNA       T -> C
611                   SEQ ID NO:5     cDNA       G -> A
1856                  SEQ ID NO:5     cDNA       T -> A
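
As an illustration of how a single-base change of the kind listed in Table 3
could be applied to a reference sequence, a minimal sketch follows. It assumes
1-based positions and that the base before the arrow is the form present in the
reference; both are assumptions, and the record type, function name and toy
sequence are hypothetical.

```python
from typing import NamedTuple


class Polymorphism(NamedTuple):
    position: int  # 1-based nucleotide position (assumed convention)
    ref: str       # base before the arrow in Table 3
    alt: str       # base after the arrow in Table 3


def apply_polymorphism(sequence: str, poly: Polymorphism) -> str:
    """Return the sequence carrying the alternative base at poly.position."""
    index = poly.position - 1
    observed = sequence[index].upper()
    # Refuse to edit when the reference base does not match the record.
    if observed != poly.ref:
        raise ValueError(
            f"expected {poly.ref} at position {poly.position}, found {observed}"
        )
    return sequence[:index] + poly.alt + sequence[index + 1:]


# Hypothetical toy sequence and variant, not taken from the sequence listing.
toy_sequence = "ACGTACGT"
toy_variant = Polymorphism(position=4, ref="T", alt="C")
print(apply_polymorphism(toy_sequence, toy_variant))
```

The reference-base check guards against applying a table row to the wrong
sequence or with an off-by-one position.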



EXAMPLE 16 
Chromosomal Localization of PART-1
This example shows the chromosomal localization of PART-1 by both
polymerase chain reaction (PCR) typing and fluorescence in situ hybridization (FISH).

The medium-resolution Stanford G3 radiation hybrid panel was used to map the
chromosomal localization of PART-1 with primers 14D7mapR
(5'-TGCTTTGTTAAGATGAGGCAGGC-3') (SEQ ID NO:37) and 14D7mapF
(5'-CATTCCAGGTGTCATGGATAAAGAGC-3') (SEQ ID NO:38). The PCR was
performed with an initial incubation at 94°C for 2 minutes, followed by 35 cycles
of 94°C for 30 seconds, followed by one cycle at 54°C for 30 seconds and a final cycle
at 72°C for 30 seconds. The reaction products were separated on a 1.2% agarose gel and
the resulting product pattern was analyzed through the Stanford genome web server
(www.shgc.stanford.edu) to determine the probable chromosomal location. Analysis of
the typing results indicates that PART-1 is mapped closest to SHGC-14390 on
chromosome 5 with a lod score of 8.60 and a cR 10,000 distance of IScRS. SHGC-14390
is mapped between markers D5S2376 and D52604. A PART-1 cDNA probe was used to
screen an arrayed human BAC genomic library (Research Genetics, Huntsville, AL).
Three positive clones, 370E12, 493B12 and 5p8J22, were identified and confirmed by
PCR using primers 14D7mapR and 14D7mapF. BAC DNA was biotinylated by nick
translation, prehybridized with human Cot-1 DNA (Gibco-BRL) and then hybridized to
metaphase spreads of a normal male as described previously (Trask, B.J., Fluorescence in
situ hybridization, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York
(1997)). After hybridization and washing, the hybridized sites were labeled with
fluorescein-conjugated avidin and the chromosome was counter-stained with DAPI to
produce a QFH-like banding pattern. Images were digitized as described elsewhere
(Wise et al., Genome Research 7:10-16 (1997)). Ten well-spread and well-banded
metaphases were analyzed to localize the hybridization signals. This confirmed that
PART-1 is mapped to chromosome 5q12.1.
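
For orientation, a lod score is the base-10 logarithm of the likelihood ratio in
favor of linkage, so the reported value of 8.60 corresponds to odds of roughly
4 x 10^8 to 1. The conversion can be sketched as follows; the function name is
hypothetical.

```python
def lod_to_odds(lod: float) -> float:
    """Convert a lod score (log10 of the likelihood ratio) to odds."""
    return 10.0 ** lod


# Lod score reported for linkage of PART-1 to SHGC-14390.
print(f"{lod_to_odds(8.60):.3g}")
```

A lod score of 3, corresponding to odds of 1000 to 1, is the conventional
threshold for accepting linkage.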

EXAMPLE 17

Identification, Isolation, and Characterization of 8C3, an Androgen-Regulated and
Prostate-Specific cDNA

Identification, characterization, cloning, and chromosomal localization of 8C3
were performed according to essentially the same methods described above in Examples
1, 2 and 5 for ARSDR1.

The chromosomal localization of 8C3 was mapped utilizing primers 8C3mapR
(5'-TGGCTTCCTCCCTCCATTTTAGAG-3') (SEQ ID NO:39) and AP1 (Clontech, Palo
Alto, CA) in the first round, and primers 8C3mapF
(5'-GGTGTCAAAAAACTGGCACATCAG-3') (SEQ ID NO:40) and AP2 (Clontech,
Palo Alto, CA) in the second round. The PCR was performed with an initial incubation
at 94°C for 30 seconds, followed by one cycle at 54°C for 30 seconds and 35 cycles
at 72°C for 30 seconds.

Two cycles of 5' RACE were performed essentially as described above for
PART-1 in Example 15 using primer 170L
(5'-CTGGAGTGACACAGCGAGACCC-3') (SEQ ID NO:41) in the first round, followed
by 5' RACE PCR as follows: one cycle at 94°C for 30 seconds, 5 cycles at 94°C for 5
seconds followed by 72°C for 4 minutes, 5 cycles at 94°C for 5 seconds followed by
70°C for 4 minutes, and 20 cycles at 94°C for 5 seconds followed by 68°C for 4 minutes.
In the second round, primer 43L (5'-CTGATGTGCCAGTTTTTTGACACC-3') (SEQ ID
NO:42) was used and 5' RACE PCR was performed as follows: one cycle at 94°C
for 30 seconds, 2 cycles at 94°C for 5 seconds followed by 72°C for 4 minutes, and 2
cycles at 94°C for 5 seconds followed by 70°C for 4 minutes.
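
The stepped 5' RACE cycling profiles above can be written down as data and
totalled, for example to estimate instrument time. The sketch below is an
illustration under stated assumptions: the data layout and function name are
hypothetical, and only programmed hold times are summed (ramp times between
temperatures are ignored).

```python
# Each stage is (number_of_cycles, [(temperature_C, seconds), ...]).
FIRST_ROUND = [
    (1, [(94, 30)]),
    (5, [(94, 5), (72, 240)]),
    (5, [(94, 5), (70, 240)]),
    (20, [(94, 5), (68, 240)]),
]
SECOND_ROUND = [
    (1, [(94, 30)]),
    (2, [(94, 5), (72, 240)]),
    (2, [(94, 5), (70, 240)]),
]


def total_hold_seconds(program) -> int:
    """Sum the programmed hold times over all cycles of all stages."""
    return sum(
        cycles * sum(seconds for _temperature, seconds in steps)
        for cycles, steps in program
    )


for name, program in (("first round", FIRST_ROUND), ("second round", SECOND_ROUND)):
    seconds = total_hold_seconds(program)
    print(f"{name}: {seconds} s ({seconds / 60:.1f} min)")
```

Keeping the profile as data rather than prose makes the stepwise lowering of the
annealing/extension temperature (72°C, 70°C, 68°C) explicit.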

While the preferred embodiment of the invention has been illustrated and
described, it will be appreciated that various changes can be made therein without
departing from the spirit and scope of the invention.

The embodiments of the invention in which an exclusive property or privilege is
claimed are defined as follows:

1. An isolated polynucleotide capable of hybridizing under stringent
conditions to at least 15 contiguous nucleotides from a nucleotide sequence selected from
the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10 and SEQ ID NO:11.

2. A polynucleotide of Claim 1, wherein said fragment comprises
substantially the nucleotide sequence shown as nucleotides 1 to 3,113 of SEQ ID NO:8,
or functional fragment thereof.

3. A polynucleotide of Claim 2, wherein said functional fragment comprises
an androgen response element shown as nucleotide numbers 2,246 to 2,259 of SEQ ID
NO:8.

4. A polynucleotide of Claim 2, wherein said functional fragment comprises
a progesterone responsive element shown as nucleotide numbers 2,175 to 2,189 or 2,627
to 2,641 of SEQ ID NO:8.

5. A polynucleotide of Claim 1, wherein said fragment comprises an
androgen response element shown as nucleotides 576 to 590 of SEQ ID NO:9.

6. A polynucleotide of Claim 1, wherein said fragment comprises a Pbx-1a
regulatory fragment shown as nucleotides 536 to 544 of SEQ ID NO:11.

7. A substantially pure polynucleotide probe comprising at least 15
contiguous nucleotides from a nucleotide sequence selected from the group consisting of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:8, SEQ ID
NO:9, SEQ ID NO:10 and SEQ ID NO:11.

8. A nucleic acid probe of Claim 7, which comprises an oligonucleotide of
15-18 nucleotides in length.

9. A nucleic acid probe of Claim 7, further comprising a detectable label.

10. A substantially pure polypeptide comprising substantially an amino acid
sequence selected from the group consisting of the sequences shown as SEQ ID NO:2,
SEQ ID NO:6, and functional fragment thereof.

11. A method of diagnosing or predicting the susceptibility of a prostate
neoplastic condition in an individual suspected of having a neoplastic condition of the
prostate, comprising:

(a) obtaining a fluid sample from an individual;

(b) determining an expression level of at least one polypeptide chosen
from the group consisting of ARSDR1, TMPRSS2, and PART-1; and

(c) comparing said measured expression level of said chosen
polypeptide to a normal expression level of said chosen polypeptide from a normal fluid
sample, wherein said measured expression level for said chosen polypeptide of 2-fold or
more from said fluid sample from said individual compared to said normal expression
level indicates the presence of a prostate neoplastic condition.

12. The method of Claim 11, wherein said fluid sample and said normal fluid
sample are selected from the group consisting of blood, serum, urine and semen.

13. The method of Claim 11, wherein said expression level is determined by
measuring the amount of RNA encoding the chosen polypeptide.

14. The method of Claim 11, wherein said expression level is determined by
measuring the activity of said chosen polypeptide.

15. A method of diagnosing or predicting the susceptibility of a prostate
neoplastic condition in an individual suspected of having a neoplastic condition of the
prostate, comprising:

(a) obtaining a prostate cell sample of the individual;

(b) determining an expression level of at least one polypeptide chosen
from ARSDR1, TMPRSS2, and PART-1; and

(c) comparing said measured expression level of said chosen
polypeptide to a normal expression level of said chosen polypeptide from normal prostate
cells or from an androgen-dependent cell line, wherein said measured expression level
for said chosen polypeptide of 2-fold from said individual compared to normal prostate
cells or from an androgen-dependent cell line indicates the presence of a prostate
neoplastic condition.

16. The method of Claim 15, wherein said expression level is determined by
measuring the amount of RNA encoding the chosen polypeptide.

17. The method of Claim 16, wherein said chosen polypeptide is ARSDR1
and the amount of RNA is determined by hybridization with a polynucleotide probe
comprising substantially the nucleotide sequence of SEQ ID NO:1, or fragment thereof.

18. The method of Claim 17, wherein said fragment of said polynucleotide
probe further comprises an oligonucleotide of about 15-18 nucleotides in length.

19. The method of Claim 18, wherein said polynucleotide probe further
comprises a detectable label.

20. The method of Claim 16, wherein said chosen polypeptide is TMPRSS2
and the amount of RNA is determined by hybridization with a polynucleotide probe
comprising substantially the nucleotide sequence of SEQ ID NO:3, or fragment thereof.

21. The method of Claim 20, wherein said fragment of said polynucleotide
probe further comprises an oligonucleotide of about 15-18 nucleotides in length.

22. The method of Claim 21, wherein said polynucleotide probe further
comprises a detectable label.

23. The method of Claim 16, wherein said chosen polypeptide is PART-1 and
the amount of RNA is determined by hybridization with a polynucleotide probe
comprising substantially the nucleotide sequence of SEQ ID NO:5, or fragment thereof.

24. The method of Claim 23, wherein said fragment of said polynucleotide
probe further comprises an oligonucleotide of about 15-18 nucleotides in length.

25. The method of Claim 24, wherein said polynucleotide probe further
comprises a detectable label.

26. The method of Claim 15, wherein said chosen polypeptide is ARSDR1
and said amount of polypeptide is determined by contacting a cell, a cell lysate, or
fractionated sample thereof, from said individual with a binding agent selective for
ARSDR1, and determining the amount of selective binding of said agent.

27. The method of Claim 26, wherein said binding agent selective for
ARSDR1 further comprises an antibody or a non-hydrolyzable short-chain
dehydrogenase/reductase substrate analog.

28. The method of Claim 27, wherein said binding agent further comprises a
detectable label.

29. The method of Claim 15, wherein said chosen polypeptide is TMPRSS2
and said amount of polypeptide is determined by contacting a cell, a cell lysate, or
fractionated sample thereof, from said individual with a binding agent selective for
TMPRSS2, and determining the amount of selective binding of said agent.

30. The method of Claim 29, wherein said binding agent selective for
TMPRSS2 further comprises an antibody or a non-hydrolyzable serine protease substrate
analog.

31. The method of Claim 30, wherein said binding agent further comprises a
detectable label.

32. The method of Claim 15, wherein said chosen polypeptide is PART-1 and
said amount of polypeptide is determined by contacting a cell, a cell lysate, or
fractionated sample thereof, from said individual with a binding agent selective for
PART-1, and determining the amount of selective binding of said agent.

33. The method of Claim 32, wherein said fractionated sample further
comprises lipid membranes.

34. The method of Claim 33, wherein said binding agent selective for
PART-1 further comprises an antibody.

35. The method of Claim 34, wherein said binding agent further comprises a
detectable label.

36. The method of Claim 15, wherein said expression level is determined by
measuring an activity of said chosen polypeptide.

37. The method of Claim 36, wherein said chosen polypeptide is ARSDR1 and
said activity is determined by contacting a cell, a cell lysate, or fractionated sample
thereof, from said individual with a short-chain dehydrogenase/reductase substrate
selective for ARSDR1, and determining the amount of product formed by ARSDR1.

38. The method of Claim 37, wherein the amount of said product formation is
determined by measuring the appearance of reduced coenzyme.

39. The method of Claim 37, wherein the amount of said product formation is
determined by measuring the disappearance of non-reduced coenzyme.

40. The method of Claim 37, wherein the amount of said product formation is
determined by measuring the appearance of said product.

41. The method of Claim 37, wherein the amount of said product formation is
determined by measuring the disappearance of said substrate.

42. The method of Claim 36, wherein said chosen polypeptide is TMPRSS2
and said activity is determined by contacting a cell, a cell lysate, or fractionated sample
thereof, from said individual with a serine protease substrate selective for TMPRSS2, and
determining the amount of product formed by TMPRSS2.

43. The method of Claim 42, wherein said fractionated sample further
comprises lipid membranes.

44. A method of identifying a compound that inhibits the activity of ARSDR1
comprising contacting a sample containing ARSDR1 and an ARSDR1 substrate with a test
compound under conditions that allow product formation from said ARSDR1 substrate,
and measuring the amount of said product formation from said ARSDR1 substrate,
wherein a decrease in the amount of said product formation in the presence of said test
compound compared to the absence of said test compound indicates that said compound
has ARSDR1 inhibitory activity.

45. The method of Claim 44, wherein the amount of said product formation is 
determined by measuring the appearance of reduced coenzyme. 

46. The method of Claim 44, wherein the amount of said product formation is 
determined by measuring the disappearance of non-reduced coenzyme. 

47. The method of Claim 44, wherein the amount of said product formation is 
determined by measuring the appearance of said product. 

48. The method of Claim 44, wherein the amount of said product formation is 
determined by measuring the disappearance of said substrate. 

49. The method of Claim 44, wherein said sample further comprises prostate
tissue, a prostate cell population or a recombinant cell population expressing ARSDR1.

50. The method of Claim 44, wherein said sample further comprises a prostate
cell lysate, a recombinant cell lysate expressing ARSDR1, an in vitro translation lysate
containing ARSDR1 mRNA, a fractionated sample of a prostate cell lysate, a
fractionated sample of a recombinant cell lysate expressing ARSDR1, a fractionated
sample of an in vitro translation lysate containing ARSDR1 mRNA or an isolated
ARSDR1 polypeptide.

51. A method of identifying a compound that inhibits the activity of
TMPRSS2 comprising contacting a sample containing TMPRSS2 and a TMPRSS2
substrate with a test compound under conditions that allow cleavage of said TMPRSS2
substrate, and measuring the amount of cleavage of said TMPRSS2 substrate, wherein a
decrease in the amount of cleavage of said TMPRSS2 substrate in the presence of said
test compound compared to the absence of said test compound indicates that said
compound has TMPRSS2 inhibitory activity.

52. The method of Claim 51, wherein said sample further comprises prostate
tissue, a prostate cell population or a recombinant cell population expressing TMPRSS2.

53. The method of Claim 51, wherein said sample further comprises a prostate
cell lysate, a recombinant cell lysate expressing TMPRSS2, an in vitro translation lysate
containing TMPRSS2 mRNA, a fractionated sample of a prostate cell lysate, a
fractionated sample of a recombinant cell lysate expressing TMPRSS2, a fractionated
sample of an in vitro translation lysate containing TMPRSS2 mRNA or an isolated
TMPRSS2 polypeptide.

54. A method of treating or reducing the progression of a prostate neoplastic
condition, comprising administering to an individual having a neoplastic condition of the
prostate an inhibitory amount of a selective inhibitor of at least one prostate specific
polypeptide chosen from ARSDR1, TMPRSS2, and PART-1, wherein said inhibitory
amount causes a reduction of at least about 2-fold in the amount or activity of said
chosen polypeptide.

55. The method of Claim 54, wherein said chosen polypeptide is ARSDR1
and said selective inhibitor is a short-chain dehydrogenase/reductase inhibitor.

56. The method of Claim 55, wherein said selective inhibitor is an ARSDR1
antisense polynucleotide.

57. The method of Claim 56, wherein said selective inhibitor binds to the
ARSDR1 5' promoter and regulatory region and inhibits transcription of ARSDR1.

58. The method of Claim 54, wherein said chosen polypeptide is TMPRSS2
and said selective inhibitor is a serine protease inhibitor.

59. The method of Claim 58, wherein said selective inhibitor is a TMPRSS2
antisense nucleic acid.

60. The method of Claim 59, wherein said selective inhibitor binds to the
TMPRSS2 5' promoter and regulatory region and inhibits transcription of TMPRSS2.

61. The method of Claim 54, wherein said chosen polypeptide is PART-1 and
said selective inhibitor is a PART-1 antisense nucleic acid.

62. The method of Claim 61, wherein said selective inhibitor binds to the
PART-1 5' promoter and regulatory region and inhibits transcription of PART-1.

63. An antibody that binds specifically to a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, and
SEQ ID NO:6, or a fragment thereof.

64. The antibody of Claim 63 wherein the polypeptide has an amino acid
sequence substantially similar to the sequence shown in SEQ ID NO:4 and the antibody
binds specifically to an epitope in the protease domain of SEQ ID NO:4.

65. The antibody of Claim 64 wherein the epitope is within an amino acid
sequence selected from the group consisting of SEQ ID NO:30, SEQ ID NO:31 and
SEQ ID NO:32.
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gctggagcat cccgccctgg tgccgccgca ycicygcagag atggtcgagc tc atg zzc 58 -> 

Met Phe □ 

1 - 

G 

ccg ctg ttg etc etc ctt ctg ccc ttc ct- etc tat acg get gcg cee 106 ' , □ 

Pro Leu Leu Leu Leu Leu Leu Pro Phe Leu Lsu Tyr Met Aia Ala Pro J 

5 10 i5 2 



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1 



202 



250 



caa ate agg aaa atg ctg tec agt ggg cig trc aca cca act gtc cag 154 
Gin lie Arg Lys Met Leu Ser Ser Giy Vai Cys Thr Ser Thr Val Gin 
20 15 ' 30 

ctt cct ggg aaa gra gtt gzc arc aca (77a .7-- aar. ar.a ggr. ate egg 
Leu Pro Gly Lys Val Val Val Vai Thr Gly .-.la Asn Thr Gly lie Giy 
35 40 45 SO 

aag gag aca gcc aaa gag ctg get cag aga cca get cga gta tat t::a 
Lys Glu Thr Ala Lys Glu Leu Aia Gin Arg Cily Ala Arg vai Tyr Leu a 
55 50 55 □ 

get tgc egg gat gtg gaa aag ggg gaa ttg ctg gcc aaa gag ate cag 298 O 
Ala Cys Arg Asp Val Glu Lys Gly Glu Lau Vai Ala Lys Glu Ila Gin U 

70 75 eo □ 

acc acg aca ggg aac cag cag gtg ttg gtg egg aaa ctg gac ctg tcL 346 □ 
Thr Thr Thr Gly Asn Gin Gin Val Leu Val Arg Lys Leu Asp Leu Ser . C 

95 90 95 □ 

0 

394 Q 

Q 



gat act aag tct att cga get ttt get aag ggc ttc tta get gag gaa 
Asp Thr Lys Ser lie Arg Aia Phe Ala Lys Giy Phe Leu Aia Glu Glu 

100 105 110 ^ ■ □ 

C 



aag cac etc cac gtt ttg ate aac aat gca gga gtg atg atg tgt ccg 
Lys His Leu His Val Leu lie Asn Asn Ala Gly Val Met Met Cys Pro 
115 120 125 " 130 



442 



490 



tac teg aag aca gca gat ggc ttt gag atg cac ata gga gtc aac cac 
Tyr Ser Lys Thr Ala Asp, Giy Phe Glu Met His lie Gly Val Asn His C 
135 140 145 a 



□ 

538 □ 
□ 



ttg ggt cac etc etc eta acc cat ctg ctg eta gag aaa eta aag gaa 

Leu Gly His Phe Leu Leu Thr His Leu Leu Leu Glu Lys Leu Lys Glu 

150 155 160 □ 

□ 

tea gcc cca tea' agg ata gta aat gtg tct tec etc gca cat cac ctg 586 □ 

Ger Ala Pro Ser Arg lie Val Asn Val Ser Ser Leu Ala His His Leu □ 
165 170 175 



gga agg ate cac ttc cat aac ctg cag ggc gag aaa ttc tac aat cca 
Gly Arg He His Phe His-Asn Leu Gin Gly Glu Lys Phe Tyr Asn Ala 
130 IBS 190 



n 
□ 

634 □ 
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ggc ctg gcc -ac tgt cac age iag cza. zz- adc acc c-c ttc acc cag S92 

Gly Leu Ala Tyr Cys His S^r Lys Leu .--la A5r. lie Leu ?he Thr Gin . I! 

195 200 , 20= 210 Z 

gaa ctg gcc egg aga c'a aaa ggc zcr. ggc g-- ==g acc -at cct gta 730 □ 

Glu Lea Ala Arn Arg Leu Lys Giy Ser Gly Val T^r Thr Tyr Ser Va: 2 

215 220 225 3 

cac cct ggc aca gtc caa cct gaa ctg gtt egg cac tea tct ttc atg 778 3 

His Pro Gly Thr Val Gin Ser Glu Leu V^i Arc His Ser Ser ?he Met 3 

230 . 235 240. 3 

0 

aga tgg atg tgg tgg ctt ttc tec ttt ttc ate aag act cct cag cag 826 □ 

Arg Tr? Met: Trp Trp Leu ?he Ser Phe Phe lie Lys Thr Pro Gin Gin □ 

245 250 255 ^ 

gga gcc cag acc age ctg cac tgt gcc tta aca gaa ggt ctt gag att 874 

Gly. Ala Gin Thr Ser Leu His Cys Ala Leu .Thr Glu Gly Leu Glu lie U 

260 265 270 0 



eta agt ggg aat cat ttc agt gac tgt cat gtg gea tgg gtg tct gtc 922 
Leu Ser Gly Asn His Phe Ser Asp Cys His Val Ala Trp Val Ser Val 
275 280 285 290 



caa gct cgt aat gag act ata gca agg cgg ctg tgg gac gtc agt tgt 970
Gln Ala Arg Asn Glu Thr Ile Ala Arg Arg Leu Trp Asp Val Ser Cys
295 300 305

gac ctg ctg ggc ctc cca ata gac taacaggcag tgccagttgg acccaagaga 1024
Asp Leu Leu Gly Leu Pro Ile Asp
310

agactgcagc agactacaca gtacttcttg tcaaaatgat tctccttcaa ggttttcaaa 1084
acctttagca caaagagagc aaaaccttcc agccttgcct gcttggtgtc cagttaaaac 1144
tcagtgtact gccagattcg tctaaatgtc tgtcatgtcc agatttactt tgcttctgtt 1204
actgccagag ttactagaga tatcataaca ggataagaag accctcatat gacctgcaca 1264
gctcattttc cttctgaaag aaactactac ctaggagaat ctaagctata gcagggatga 1324
tttatgcaaa tttgaactag cttctttgtt cacaattcag ttcctcccaa ccaaccagtc 1384
ttcacttcaa gagggccaca ctgcaacctc agc::taacat gaataacaaa gactggctca 1444

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ggagcacggc ttgcccaggc atggrggatc accggagg-z agcag^iTicaa gaccagcctg 1504 ■ 2 

gccaacacgg Lgaaacccca cctctaciaa aaan-gtrti -a-:c^"tgtg cgtcttcctc 15'54 I 

tttatgrgtc ccaagggagt atttccacaa acr.T:caaa= =■ agccacaata atcagagar.g 1524 D 

■J 

qagcaaacca gtgccatcca gtctttaLgc aaargaaa^r c'ccaaaggg aagcagattc 16B4 j 

-gLana-cgtt ggcaaccacc caccaagacc acatggg^ag cagggaagaa gtaaaaaaag 1744 

agaaggagaa taccggaaga taatgcacaa e.a-gaagcg=. ccage-aagg attaactagc 1 = 04 □ 

ccttcaagga ttaaccagcc aaggattaar agcaaaag=t atcaaacatg ctaacatagc 1354 3 

' tatggaggaa ttgagggcaa gcacccagga ctgacgaggT cttaacsaaa accagtgtgg 1924 H 

caaaaaaaaa aaaaaaaaaa aaaaaaaaaa accccaaaaa caaacaaaca aaaaaaacaa 1984 G 

□ 

ttcctcattc agaaaaatta tctti^gggac tgatatcgg- aarcatggtc aatttaataa 2044 J 

□ 

tatcttgggg carttcctta cattgtcttg acaagaczaa aatgtctgtg ccaaaatttt 2104 □ 

□ 

gtattttatt tggagacttc ttatcaaaag taatgctgcc aaaggaagtc taaggaatta 2164 
gtagtgttcc catcacttgt ttggagtgtg ctattccaaa agatnttgat crcctggaat 222 4 
gacaactata ttttaacttt ggtgggggaa agagttarag- gaccacagtc ctcacttctg 2284 
atacttgtaa attaatcttt cattgcactt gttrttgaccE t^aagctata tgt-;:agaaa 234 4 
tggtcatttt acggaaaaat cagaaaaatt ccgacai-ig cgcagaataa atgaattaac 2404 
gttttactta atttatatcg aacrgncaat gacaaataaa aattcttttt gattattttt 24 64 




tgttttcatt taccagaata aaaactaaga actaaaagc;: tgactacagt caaaaaaaaa 2524 0 

□ 

aaaaaaaaaa aaaa

<210> 2
<211> 314
<212> PRT
<213> Homo sapiens

<400> 2

Met Phe Pro Leu Leu Leu Leu Leu Leu Pro Phe Leu Leu Tyr Met Ala
1               5                   10                  15

Ala Pro Gln Ile Arg Lys Met Leu Ser Ser Gly Val Cys Thr Ser Thr
            20                  25                  30

Val Gln Leu Pro Gly Lys Val Val Val Val Thr Gly Ala Asn Thr Gly
        35                  40                  45

Ile Gly Lys Glu Thr Ala Lys Glu Leu Ala Gln Arg Gly Ala Arg Val
    50                  55                  60

Tyr Leu Ala Cys Arg Asp Val Glu Lys Gly Glu Leu Val Ala Lys Glu
65                  70                  75                  80

Ile Gln Thr Thr Thr Gly Asn Gln Gln Val Leu Val Arg Lys Leu Asp
                85                  90                  95

Leu Ser Asp Thr Lys Ser Ile Arg Ala Phe Ala Lys Gly Phe Leu Ala
            100                 105                 110

Glu Glu Lys His Leu His Val Leu Ile Asn Asn Ala Gly Val Met Met
        115                 120                 125

Cys Pro Tyr Ser Lys Thr Ala Asp Gly Phe Glu Met His Ile Gly Val
    130                 135                 140

Asn His Leu Gly His Phe Leu Leu Thr His Leu Leu Leu Glu Lys Leu
145                 150                 155                 160

Lys Glu Ser Ala Pro Ser Arg Ile Val Asn Val Ser Ser Leu Ala His
                165                 170                 175

His Leu Gly Arg Ile His Phe His Asn Leu Gln Gly Glu Lys Phe Tyr
            180                 185                 190

Asn Ala Gly Leu Ala Tyr Cys His Ser Lys Leu Ala Asn Ile Leu Phe
        195                 200                 205

Thr Gln Glu Leu Ala Arg Arg Leu Lys Gly Ser Gly Val Thr Thr Tyr
    210                 215                 220

Ser Val His Pro Gly Thr Val Gln Ser Glu Leu Val Arg His Ser Ser
225                 230                 235                 240

Phe Met Arg Trp Met Trp Trp Leu Phe Ser Phe Phe Ile Lys Thr Pro
                245                 250                 255

Gln Gln Gly Ala Gln Thr Ser Leu His Cys Ala Leu Thr Glu Gly Leu
            260                 265                 270

Glu Ile Leu Ser Gly Asn His Phe Ser Asp Cys His Val Ala Trp Val
        275                 280                 285

Ser Val Gln Ala Arg Asn Glu Thr Ile Ala Arg Arg Leu Trp Asp Val
    290                 295                 300

Ser Cys Asp Leu Leu Gly Leu Pro Ile Asp
305                 310

<210> 3
<211> 3966
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (57)..(1535)

<400> 3

grcatattga acatcccaga tacctarcat tactcgaccc t:gttgataac agcaag atg 59 G 

Met 2 

1 J 

get ttg aac tea ggg tea cca cca get att gga cct tac tat gaa aac' 107 ;.. 

Ala Leu Asn Ser Gly Ser Pro Pro Aia lie Gly Pro Tyr Tyr Glu Asn 'J 
5 10 15 

Q 

cat gga tac caa ccg gaa aac ccc tat ccc gca cag ccc act gtg gtc 155 0 

His Gly Tyr Gin Pro Glu Asn Pro Tyr Pro Aia Gin Pro Thr Vai Val 0 

20 25 30 . 0 



□ 



ccc act gtc tac gag gtg cat ccg get cag tac tac ccg tec ccc gtg 203 

Pro Thr Vai Tyr Giu Vai Kis Pro Aia Gin Tyr Tyr Pro Ser Pro Val 

35 40 45 3 

ccc cag tac gcc ccg agg gtc ctg acg cag get tec aac ccc gee gtc 251 ^ 

Pro Gin Tyr Aia Pro Arg Val Leu Thr Gin Aia Ser Asn Pro Vai Val •-' 
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50 55 55 

tgc acg cag ccc aaa tec cca zcz gqg ica g-g cgc acc ::ca>aag acc 299 

Cys Thr Gin Pre lys Ser Pro 3er Giy T.-.r Val Jys T.^.r Ser Lys Thr 

70 "5 30 

aag aaa gca czc tgc ate acc zzq acc c-g ggg acc zzc etc gtg gga 347 

Lys Lys Ala Leu Cys lie Thr Lsu Thr l=u Gly .Tr.r ?he Leu Val Gly 

85 90 95 

get gcg ctg gcc get ggc eta etc tgg aag ttc ate ggc age aag tgc 395 

Ala Ala Leu Ala Ala Giy Leu Leu Trp lys ?^-e Met Gly Ser Lys Cys 

100 105 110 



tcc aac tct ggg ata gag tgc gac tcc tca ggt acc tgc atc aac ccc 443
Ser Asn Ser Gly Ile Glu Cys Asp Ser Ser Gly Thr Cys Ile Asn Pro
115 120 125

tct aac tgg tgt gat ggc gtg tca cac tgc ccc ggc ggg gag gac gag 491
Ser Asn Trp Cys Asp Gly Val Ser His Cys Pro Gly Gly Glu Asp Glu
130 135 140 145

aat cgg tgt gtt cgc ctc tac gga cca aac ttc atc ctt cag atg tac 539
Asn Arg Cys Val Arg Leu Tyr Gly Pro Asn Phe Ile Leu Gln Met Tyr
150 155 160

tca tct cag agg aag tcc tgg cac cct gtg tgc caa gac gac tgg aac 587
Ser Ser Gln Arg Lys Ser Trp His Pro Val Cys Gln Asp Asp Trp Asn
165 170 175

gag aac tac ggg cgg gcg gcc tgc agg gac atg ggc tat aag aat aat 635
Glu Asn Tyr Gly Arg Ala Ala Cys Arg Asp Met Gly Tyr Lys Asn Asn
180 185 190

ttt tac tct agc caa gga ata gtg gat gac agc gga tcc acc agc ttt 683
Phe Tyr Ser Ser Gln Gly Ile Val Asp Asp Ser Gly Ser Thr Ser Phe
195 200 205

atg aaa ctg aac aca agt gcc ggc aac gtc gat atc tat aaa aaa ctg 731
Met Lys Leu Asn Thr Ser Ala Gly Asn Val Asp Ile Tyr Lys Lys Leu
210 215 220 225

tac cac agt gat gcc tgt tct tca aaa gca gtg gtt tct tta cgc tgt 779
Tyr His Ser Asp Ala Cys Ser Ser Lys Ala Val Val Ser Leu Arg Cys
230 235 240
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tta g=c cgc ggg gtc aac ctg aac zca zgz irr cag age agg ate gtg =27 • □ 
Leu Ala Cys Gly Val Asn Leu Asn Ser Ser Arg Gln Ser Arg Ile Val
245 250 255

ggc ggr cag age gcg ctic ccg ggg gcc cc^r zzz zqq cag gtc age ccg 375 ~; 
Gly Gly Glu Ser Ala Leu Pro Gly Ala Trp Pro Trp Gln Val Ser Leu
260 265 270

cac gtc cag aac gtc cac gtg tgc gga ggc atc atc acc ccc gag 923
His Val Gln Asn Val His Val Cys Gly Gly Ser Ile Ile Thr Pro Glu
275 280 285

tgg atc gcg aca gcc gcc cac tgc gtg gaa aaa cct ctt aac aat cca 971
Trp Ile Val Thr Ala Ala His Cys Val Glu Lys Pro Leu Asn Asn Pro
290 295 300 305

tgg cat tgg acg gca ttt gcg ggg att ttg aga caa tct ttc atg ttc 1019
Trp His Trp Thr Ala Phe Ala Gly Ile Leu Arg Gln Ser Phe Met Phe
310 315 320

tat gga gee gga tac caa gta caa aaa gtg att tct cat cca aat tat 1067
Tyr Gly Ala Gly Tyr Gln Val Gln Lys Val Ile Ser His Pro Asn Tyr
325 330 335

gac tee aag ace aag aac aat gac atc gcg ctg atg aag ctg cag aag 1115
Asp Ser Lys Thr Lys Asn Asn Asp Ile Ala Leu Met Lys Leu Gln Lys
340 345 350

cct ctg act ttc aac gac cta gtg aaa cca gtg tgt ctg ccc aac cca 1163
Pro Leu Thr Phe Asn Asp Leu Val Lys Pro Val Cys Leu Pro Asn Pro
355 360 365

ggc atg atg ctg cag cca gaa cag ctc cgc tgg att tcc ggg tgg ggg 1211
Gly Met Met Leu Gln Pro Glu Gln Leu Cys Trp Ile Ser Gly Trp Gly
370 375 380 385

gcc acc gag gag aaa ggg aag acc tca gaa gtg ctg aac gct gee aag 1259
Ala Thr Glu Glu Lys Gly Lys Thr Ser Glu Val Leu Asn Ala Ala Lys
390 395 400

gtg ctt ctc att gag aca cag aga tgc aac agc aga tat gtc tat gac 1307
Val Leu Leu Ile Glu Thr Gln Arg Cys Asn Ser Arg Tyr Val Tyr Asp
405 410 415
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aac erg ate aca cca gcc atg ate tgt: =-= c?= zzc z-z cag ggg aac 1355 ^ 
Asn Leu Ile Thr Pro Ala Met Ile Cys Ala Gly Phe Leu Gln Gly Asn

420 425 430

ccc gat cct tec cag ggt gac age gga ggg zz- zzq etc act teg aac 1403 Z 
Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Thr Ser Asn
435 440 445

aac aac ate tgg egg ctg ata ggg gat ^ea 5g= tgg ggr :cr ggc tgt 1451 

Asn Asn Ile Trp Trp Leu Ile Gly Asp Thr Ser Trp Gly Ser Gly Cys

450 455 460 465

gcc aa^i get tac aga cca gga gtg tac egg aat etg ate eta ttc acg 14 99 □ 
Ala Lys Ala Tyr Arg Pro Gly Val Tyr Gly Asn Val Met Val Phe Thr
470 475 480

gac tgg atc tat cga caa atg aag gca aac eee taa tecacatggt 1545
Asp Trp Ile Tyr Arg Gln Met Lys Ala Asn Gly
485 490

cttcgtcctt gacgtcgttt tacaagaaaa caatgggget ggttttgctt ccecgtgcat 1605

gatttactct tagagacgat tcagaggtca cttcattttt attaaacagt gaacttgtct 1665 

ggctttggca ctctctgcca tactgtgcag gc-gcagtgg ctcccctgcc cagcctgctc 1725 

tccctaaccc cttgtccgca aggggtgatg gccggctcgt tgtgggcact ggcggtcaat 1785 

tgtggaagga agagggctgg aggctgcccc eattgagatc ttcctgctga gtcctttcca 1845 

ggggccaatt ttggatgagc atggagctgt cacttcteag ctgccggatg acttgagatg 1905 

aaaaaggaga gacatggaaa gggagacagc csggtggcac ctgcagcggc cgcectswsg 1965 

ggscacttgg tagtgtcccc agcctacctc tceacaacgg gattttgctg atgggttctt 2025 

agagccttag eagccctgga tggtggccag aaataaaggg accagccctt cacgggtggt 2085 

gacgtggrag tcacttgtaa ggggaacaga aacarttrtg ttcttatggg gtgagaatat 2145 

agacagtgcc cttggtgcga gggaagcaat tgaaaaegaa cttgccctga gcactcctgg 2205 

tgcaggtctc cacctgcaca ttgggtgggg ctcetgggag ggagactcag ccttcetcct 2265 

catcctcect gaccctgcte ctagcaccet ggagagtgca catgeccett ggtcctggca 2325 



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qggcgccaag cctggcacca zgzcqqcczc ctcaggcc-g ccagccac::g gaaattgagg 2335' -I 

tccatggggg aaaccaagga zqctcagzzz aagg-acacc gcctcc;a':qt: tatgcttcta 2445 2 

cacattgcta ccccagcgcc cccggaiaci -cagcttt-sa ccnc-ccaag tagtccacct 2505 Z 

ccacttaact: c'ccgaaaci: 'gratcacc-t rgccaag-^aa gagtgg'ggc ccatttcagc 2555 . Z 

tgcc^iLgaca aaacgac::gg cLCC'cgaccc aacg:;:;c:iaT: aaaccaatgt gctgaagcaa 2625 I! 

agtgcccatg gtggcggcga agaagagaaa gacgrgc-wC gT::;:;cggact crctgcggcc 2 585 j 

ccttccaarg c-cgtgggttc ccaaccaggg gaagggtccc CEti:gcat:t.g ccaagcgcca 2745 Z 

taaccacgag cacTiactcca ccacggcuct gcc-cccggc caagcaggct ggtrtgcaag 2805 Z 

aatgaaatga atgattctac agctaggact taaccttgaa atggaaagtc ttgcaatccc 2855 U 

□ 

arttgcagga tccgtcngtg cacacgcctc ccragagagc agcattccca gggaccttgg 2925 Q 

□ 

aaacagttgg cactgtaagg tgcttgctcc ccaagacaca ' tcctaaaagg tgttgtaatg 29B5 □ 

□ 

gtgaaaacgt ctrccttctt tattgcccct tcttatttat gtgaacaact gtttgtcttt 3045 □ 

O 

ttrtgtatct tttutaaact gtaaagttca accgtgaaaa tgaatatcat gcaaataaat 3105 □ 

□ 

tatgcgartt Ctttttcaaa gtaaccactg catcetitgaa gttctgcctg gtgagtagga 3165 □ 

ccagcctcca tctccttata agggggtgat gtrgaggctg crggtcagag gacca'aaggt 3225 C 

gaggcaaggc cagactrggt gctcc^gtgg ccggcgccc* cagcc=c-gc agcctgtccr 3285 ~ 

gttggagagg tccctcaaat gaccccrtct tactatrcta ttagtctgrt tccacgctcc 3345 □ 

□ 

taataaagac atacccaaga ctgcaattta caaaagaaag aagtctat'g gacttacaat 3405 C 

G 

tccacatggc tggggtggcc tcacaarcat ggcagaaagc aaggaagagc aaatcacatc 34 55 0 
^ . ■ u 

ttacatggar ggcagcaggc agggagagag tttgtgcaca gaaactccca ttttttaaac 3525 ' □ 

□ 

catcaggtct cgtgagaccc attcactacc acaaaaacag cacaggaaag acc^accccc 353^ □ 



0 



atgattcaat tarctcctat caggtccccc ccacaacaca tgggaattac gggagctaca 3645 
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agatgagatt tggg~gggga cacagagcca aaccatacca ccc^igcccct gcacccccca 3705 Z 

aatctcgtat ctr-acacrt caaaaccaat caqqzczzcz caacagrccc ccaaaanctt 3765 ~ 

aacLcat-tc agcattaagc caaaagrcca cagrccaaag rc-cacccaa cacaaggcaa 3325 ~ 

attacr-cca cccacgagcc "gvgaaat.ca agagcaac":.^ aaiicacctcc cagacacaac 3SB5 Z 

gggggtacag gcarcgggtg aatacagcca ccccaaa^gg gagaaatcgg ccaaaataaa 3945 G 
ggggctgcag .gccccttaaa a 3966 G 

<210> 4
<211> 492
<212> PRT
<213> Homo sapiens

<400> 4
Met Ala Leu Asn Ser Gly Ser Pro Pro Ala Ile Gly Pro Tyr Tyr Glu
1 5 10 15

Asn His Gly Tyr Gln Pro Glu Asn Pro Tyr Pro Ala Gln Pro Thr Val
20 25 30

Val Pro Thr Val Tyr Glu Val His Pro Ala Gln Tyr Tyr Pro Ser Pro
35 40 45

Val Pro Gln Tyr Ala Pro Arg Val Leu Thr Gln Ala Ser Asn Pro Val
50 55 60

Val Cys Thr Gln Pro Lys Ser Pro Ser Gly Thr Val Cys Thr Ser Lys
65 70 75 80

Thr Lys Lys Ala Leu Cys Ile Thr Leu Thr Leu Gly Thr Phe Leu Val
85 90 95

Gly Ala Ala Leu Ala Ala Gly Leu Leu Trp Lys Phe Met Gly Ser Lys
100 105 110

Cys Ser Asn Ser Gly Ile Glu Cys Asp Ser Ser Gly Thr Cys Ile Asn
115 120 125

Pro Ser Asn Trp Cys Asp Gly Val Ser His Cys Pro Gly Gly Glu Asp
130 135 140

Glu Asn Arg Cys Val Arg Leu Tyr Gly Pro Asn Phe Ile Leu Gln Met
145 150 155 160

Tyr Ser Ser Gln Arg Lys Ser Trp His Pro Val Cys Gln Asp Asp Trp
165 170 175

Asn Glu Asn Tyr Gly Arg Ala Ala Cys Arg Asp Met Gly Tyr Lys Asn
180 185 190

Asn Phe Tyr Ser Ser Gln Gly Ile Val Asp Asp Ser Gly Ser Thr Ser
195 200 205

Phe Met Lys Leu Asn Thr Ser Ala Gly Asn Val Asp Ile Tyr Lys Lys
210 215 220

Leu Tyr His Ser Asp Ala Cys Ser Ser Lys Ala Val Val Ser Leu Arg
225 230 235 240

Cys Leu Ala Cys Gly Val Asn Leu Asn Ser Ser Arg Gln Ser Arg Ile
245 250 255

Val Gly Gly Glu Ser Ala Leu Pro Gly Ala Trp Pro Trp Gln Val Ser
260 265 270

Leu His Val Gln Asn Val His Val Cys Gly Gly Ser Ile Ile Thr Pro
275 280 285

Glu Trp Ile Val Thr Ala Ala His Cys Val Glu Lys Pro Leu Asn Asn
290 295 300

Pro Trp His Trp Thr Ala Phe Ala Gly Ile Leu Arg Gln Ser Phe Met
305 310 315 320

Phe Tyr Gly Ala Gly Tyr Gln Val Gln Lys Val Ile Ser His Pro Asn
325 330 335

Tyr Asp Ser Lys Thr Lys Asn Asn Asp Ile Ala Leu Met Lys Leu Gln
340 345 350

Lys Pro Leu Thr Phe Asn Asp Leu Val Lys Pro Val Cys Leu Pro Asn
355 360 365

□ 

Pro Gly Met Met Leu Gln Pro Glu Gln Leu Cys Trp Ile Ser Gly Trp
370 375 380

Gly Ala Thr Glu Glu Lys Gly Lys Thr Ser Glu Val Leu Asn Ala Ala
385 390 395 400

Lys Val Leu Leu Ile Glu Thr Gln Arg Cys Asn Ser Arg Tyr Val Tyr
405 410 415

Asp Asn Leu Ile Thr Pro Ala Met Ile Cys Ala Gly Phe Leu Gln Gly
420 425 430

Asn Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Thr Ser
435 440 445

Asn Asn Asn Ile Trp Trp Leu Ile Gly Asp Thr Ser Trp Gly Ser Gly
450 455 460

Cys Ala Lys Ala Tyr Arg Pro Gly Val Tyr Gly Asn Val Met Val Phe
465 470 475 480

Thr Asp Trp Ile Tyr Arg Gln Met Lys Ala Asn Gly
485 490

<210> 5

<211> 2108

<212> DNA

<213> Homo sapiens

<220>

<221> CDS

<222> (23)..(199)

<400> 5

gattiactcac acagccttga ag atg caa tgt csg cta tct agg aca gaa aca 52

Met Gln Cys Gln Leu Phe Arg Thr Glu Thr

1 5 10

tcc aag gcc gcg tca gaa ctc aat tac gac cac ata tgc att aag gca 100

Ser Lys Ala Val Ser Glu Leu Asn Tyr Asp Tyr Ile Cys Ile Lys Ala

15 20 25

gga act ggc agg cct cag ggt acg cca act ata gga ctc gtg ctt ctc 148

Gly Thr Gly Arg Pro Gln Gly Thr Pro Thr Ile Gly Leu Val Leu Leu

30 35 40

□ 
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gua cgc egg gc" a-ca ate tat gaa act gag etc cag age cag cca ate 196 J 

Val Arg Trp Ala Ile Ile Tyr Glu Thr Glu Leu Gln Ser Gln Pro Ile

45 50 55

act tagctccte=L taacaagtct aacccgctct ggaaagcrga aagggctgca 24 9 ~ 

Thr - 

ctggaacaac acagatgaga tattctacac attaatctaC ttatctggaa tcactttgcc 309 Z 

tctaaaggcc agagaaaaat cacagcttee ttctcggacg ggaaaaggac aggtgatctg 369 Z 

gggaaaacgc agctacacct ggagcaaggt etcttcccgg cttggcaatc tcagctgtgc 429 Z 

cggcgctacg ggacccgagc cgtcccagaa accaaagcgc aggcacggea gcaaacgcct 489' -J 

gagtgctgct gcctteggtg actatatgag aatggaaact tctaaggaag ccaggttgtt 549 ^ 

agaattgtta ccccctttac tcaqagataa catagactat ccaggctgag atggaaaaca 609 G 

□ 

agccctttat tgaarrttca ar.acagactc cctgcttctc atctccttaa taaaatttca 669 J 

G 

tcaaaatccc cttgaactcc catgttcaaa tctccatttg ttgacagaca aagccaacaa 729 

tactctaaac tgaggcctgc aagtcatttc atttgtattt ttgtccagaa atttcccata 789 

ggaagacttc acctcctaca actccgaaga aaacccttac tgtccaagac cgtcaccagc 84 9 U 

O 

aaccatccgc agtcattcaa gtggaagctt tcacagcttt tgtacattct ctgcctcaat 909 □ 
atacaactga grtacagact gtcccctggc tccctgaccc ttacaaacac taaaagtttc 969 G 
gtttgaccca acttcaagct gctcatctgt tagraagtga tgttcoctcc agaacacatt 1029 ■- 

catgatgaga actttctaaa agaccagcac tgctcttccc ctcctataat cataataatc 1089 0 

□ 

atgataacct gaaacatgtt actgggactc gacatttttc tggggattga aatctttagt 1149 □ 

J 

ccttggagct gtcacatagc aggggcaacc tcacactgaa acaaaggaag cgatgtccca 1209 □ 

□ 

ttattatcca ccctgagcca ccataatatg ctgtttacst ttattttctt cagcctgtgc 1269 □ 
aaaacaaagc aatggaaaag gaaactaaaa aatatacata ctagcaccat tatcttcttt 132^ □ 
tgcctaaaat tactaatgca ccacgtcagt ctgctteett caggcatcat tctcaattca 1389 Z 



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tcaggacttg tattaqcagg cnztggczag ag5c=.~t5rc -ccngtcatc acaatcaatt 1449 Z 

n 

aatgttttct ggtgarcaca ccagcccc~a cc-aagaeg. ::catgci:ai:a caagggtcac 1509 I: 

ccaaatagcr gagtgcagiic Cutgccca ta tutcci^ica": c--aaccccg caaacaagaa 1569 □ 

ttaagatgat cccaaraaaa gaaaaaccgc tcaggaaa-i gaacctti-;"" crgaaccaag 1629 j 

caci:gtcagc aaacctcagg tattagagca acnatggr-^ atrgaaaagt grctcaaaat 1689 □ 

ctgggccaag aacgatrgct: agguccacaa gctaac^rg" zrcggcctrgc cattcacgra 1749 G 

G 

aqccaaagaa agccacccot gagtaaacca tagaaaacgz T:cagacccat cctgttagca 1809 G 

igtcaaatca actaagactg gcagggtatc aacccca--c caggngTicat: ggacaaagag 1869 □ 

ccccactatt fccacagtgc cagcctccac cuaaggaaac cctagacctr ggaaccagcc 1929 G 

□ 

ccccggcagg gaaccgctga cagcttcaac gccgacag-t ggagccaarg cctcatagtg 1989 □ 

□ 

caaactgaaa gaaaaatagt tgcctttiraa aargrcagca agaaggcctg cctcatctta 2049 □ 

□ 

acaaagcaaa aaaaaatgct tcaatccaaa tcaaaaatca tgatactaaa aaaaaaaaa 2108 □ 

<210> 6
<211> 59
<212> PRT

<213> Homo sapiens

<400> 6

Met Gln Cys Gln Leu Phe Arg Thr Glu Thr Ser Lys Ala Val Ser Glu
1 5 10 15

Leu Asn Tyr Asp Tyr Ile Cys Ile Lys Ala Gly Thr Gly Arg Pro Gln

20 25 30

Gly Thr Pro Thr Ile Gly Leu Val Leu Leu Val Arg Trp Ala Ile Ile
35 40 45

Tyr Glu Thr Glu Leu Gln Ser Gln Pro Ile Thr
50 55



<210> 7
<211> 5217
<212> DNA

<213> Homo sapiens
<220>

<221> unsure
<222> (3531)

<223> Nucleotide sequence uncertain
<400> 7

ggaccaacat ttttaattct taatgtattc aazz—czzz tccaatcata cctacaacgc 60

tagggtcaai tccgctagct gccgtcartt ct--ttg-3C tttgattgca rtagccataa 120

actcacgccq caaacgttcc aattcatctt cgtzaatctc cgaaaatata ggcgraccat 180

aatcatcata cccaattgta tcccggacac caccagatgt cacaacgata ggtgttgatc 240

cgtcacctgt tagcgtcgtt gatagtcg-g ccaatacatt tccaagtttg tctttgtaat 300

cataaaccaa atttgtttct ttttgtaact ccacttttcc tccttttatg gtaataataa 360

ctntctactt gacccaccaa acgcaacaac grtatcatga ccaaatgcaa taccaccacc 420

ctcattcgtT: ccatatgacg ctctaatagt aggcaactta tctcccccga accagtctcc 480

ccaagatacc cacgtgaacg ccaaatcgtt tt-agcgtca ntaatacgca taccattacc 540

gttgaagcta attctaccgt cgtaaaagtt aaatccagaa cgtgcctcac cagcatacgt 600

tgggttttgc cacgtaaaca agtcagtgct gcttatccat gaagccataa ttccaaaccc 660

gattttatcc ccaccccaaa tttgaccagc gggagaacca att^cgttaa atgtctggtc 720

ataraatatt gccattaatg agttagcagt tgtgttggfc ggactaaatc cagcgaatac 780

tccgccaaca gtttcaaaac cgctactacc atc-gcatat tcagtctg-t cgtagaacat 840

cactgctcct gtttcaaatg cagccgttga cg;;accacgg gtaotcgtca ttgattcagg 900

ttgaar.gaac acacttccaa attgtccgct ccacgctgta ctcataaatg aagaaaagtt 960

accgacraag ctcttaacat taatattaat accattaatt tgagagaagt ctaattgacc 1020

acctttaatg aagtragcgg ttaatgttcc cctcacgata ttatcagcac ctaaattcat 1080
SUBSTITUTE SHEET (RULE 26)

WO 00/65067 PCT/US00/10920



aacgttaatc ttagcagcgt caatcgLccc gacc-gcaac cgagaaccgc cagcagcgcc 1140" 

cacaaa-ctta acccaaaccg tcccagaata sg-^-aca-- ?cz--cccgt caggtucgtL 1200 

gttgtcactt ttgaaccaaa catcgccaac aac-g(jy;_:,=i iCtggtccsg cg-taccata 1260 

aaaaacagta tcgtcacctt g-gcaacagc -c-a'tttzt tgccatrgCu gaccagtttt 1320 

agcacccaac acccagatag taccatcaaa acaatactrc zczazB^zzcc caccaggtaa 1380 

cgctttgaac cacacgtcac ccttrttagg gii-aggccgT: -cracngcac caaa^gTitgc 1440 

actgttgtga ccaccgttag caacaci:cca ctctacttga r-gccattag tcgcaanact 1500 

tccagataat ::ca--aatct gtgacgtgat ac-g--c = -- "gauaagtLgt ctcctaacga 1560 

caattcagtc attrcaggat ttaacaagtc azzczzaacz tcaaacactc gtgtcntata 1620 J 

□ 

acccaaacca cjggtcactat gcataatcaa aact:gi:gi:ca ccaagngaca agttgccgac 1680 Z 

n 

ctcqgctact gctgccgaat actcaactag cgggtgatta atagatacga gtgtttgata 1740 □ 

agcttgttga attagcacar tagggtcttc aacztcai:ca aacgrcttca gatataatct 1800 C 

agccgaacca ttaatgtgac catatttagc cgcagcgrca ggatcttcaa gaatcttact I860 □ 

□ 

tccagcaggt ccacccaatg ggctaccact tgacttagac cacgcaacgc cactgatgtc 1920 □ 

aattctacga ccacatccgt caggagagcc gTicagaacc" gagccaacct: gttccccttt 1980 3 

cccacgi^ggc aaaanagcgg tgtaaatgct a-czgaccgc tgttcctctg cgaccgt-ag 2040 □ 

caagt-actt ccaccagcga aaacctttga cgzgtcgtca ccttgtctga atacatagtc 2100 

catgaagcga ccagaaatct rgtcgccagc aartgtgata nagaagtaaa cttcgccgcc 21S0 

cattaaacca acgacttttt ggatagcttc -aagrttgta acgtagcaga agt-tgttgt 2220 

aacactacca ccaatattaa crgtacctaa cgaccacrgc gaaccatcca acgcataagt 2290 






catagcttga cragcattaa cgrrgttagg gcgc-tgtct tcaatatacg atgttgacgc 2 34Q a 

□ 

aagc-cttga taagctaatt cactagctgt gcactcaata acgtrcaacg- tg-cagtgcg 2400 □ 
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cqacacaaqi: caqaaaacca sgcg-'Tiacr gccr- = = -t3 ggg^grggLg Ligacacgta 2460 2 

3 

ccccacacca ccaLiiaaacc ccctacvaa-c cggcara-ig aacgaaaacc cgacagccg- 2520 0 

attaacictta aaatcaatct gaccagccat taaaicgzt- -ctiuucagta gLccgacaai: 2580 C 

attctgc-gc tcaticaaata aacadaccat agaatacr-a ccc-gtaact aagcgtcaca 2640 I 

ccactgacat: ragnacctgr caaccnatcn ccc-c--tta c-cttgcttc aaacacattc 2700 Q 

gttttttgaa tac-aatgrc tgacaactrg czzczzzzzz cgattgtgat accgaaartt 2760 □ 

accatatcaa ccacaactaa cgcaccaqca ggacaa-iac c-accattgt cagatiigcaa 2820 Q 

ccgttgttaa tactggcaac aarcgaacgc gtigccga-a caagtctcaa agcgattctg 2380 □ 

n 

tcgactggtt gcgagtaaat taactgcggg tcacnga-aa c-gagactgt tgccgagttc 2940 □ 
accaagagga ccatcttgaa aatccccatg aatgaactga caacaatcct gaaggcctgg 3000 □ 

n 

gaccttttgt ctgaaaatca acrgcagacc gt:aaarrzc= gacagagaaa ggaatccgta 3060 G 

D 

gttcagcact tgatccatct gtgtgaggaa aagcgrgcaa gratcagtga tgctgccctg 3120 □ 

□ 

ttagacatca tttatatgca aittcatcag caccagaaag tczgggatgt tttccagatg 31B0 C 

G 

agtaaaggac caggtgaaga tg::tgacctt tttgacazga aacaatttaa aaattcgttc 3240 u 

G 

aagaaaac-c ^ccagagagc aczaaaaaat gcgacagTica gcttcagaga aactgaggag 3300 □ 
aatgcagtct ggattcgaat cgccrgggga acacagcaca caaagccaaa ccagtacaaa 3360 
cctacctacg tggcgtacta ctcccagact ccgtacgccr tcacgccctc ctccacgctg 3420 
aggcgcaata caccgcttct gggtcaggcg ctgacaattg ctragcaaaca ccaccagatt 3480 
gtgaaaatgg acczgagaag rcggtazctg gactctctna aggcrattgt ttttaaacag 3540 
taraatcaga ccrttgaaac tcacaactcr acgacacc-c tacaggaaag aagcctrgga 3 600 
cragatataa atatagafcc aaggaccatr cacgaaaaca tagtagaaaa agagagagtc 3660 
caacgaataa ctcaagaaac attcggagat tatccLcaac cacaac^aga atttgcacaa 3720 
tataagcctg aaacgaaatt caaaagcggr ttaaacggga gcaccc'ggc Tigagaggaaa 3790 



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caacccctcc gacgcctaat aaagttcrcc agcccacacc ccciggaagc atcgaaaiicc 3340' li 

tcaqcaccag cgggcattgc agargctcca c-t-r^ccac cgcc-ac-cg catacccaac 3500 ~ 

aacagaacga atLacCttaa eacTiagagac aaa-2ogacg -g=g-ggttt ctcaagcaca 3960 ~ 

gcccciicct:: ctcgaractg cacacgcact: tczzzzcazg gc-iiagc-cgLa tagct^ccgt 402C Z 

ctgtaaactt gtatLHttcaa gaaiicc-Tigg ta^cgaa-.t -ragaaatgc ccacataat' ^OSO 

gtrgggactg at^cattcct ccacgatatg cczcczczzz ccga-atcct gccaaccgta 4140 ~ 

gccgttgtgg ca-ttgagat gacaggacat az5.z = ZA-=z ggccccacac ttgaccrtga 420C I 

gtgcctgaac gcic-cgaaai: caagcaratg gcacagccc: caagactrtt ggg-etgrgi: 4260 

ccttttttcr atggccgtct ctcctcaatt CT:ggagagct: crggncccag tggctggrtt: 4320 Z 

G 

ccagggattg attcttaagc tctggatcac agagagasgc aacaaggaac tatactcaac 43flO □ 

Q 

ccaaaacttt ttaggagaat catgaaattg gtctattcaa aggacggagt tgagtccatu 4440 □ 

G 

ctgt:tattgt tgcaagaggt tgcatacttg gtgagtcagt tatacaaaat agtgttctta 4500 0 

G 

ctgtaaatac gatacttctc anaatctact ctaccatgng tacaacactc aaactgacaa 4560 C • 

D 

atatactgac ttargaataa aggtgtcaaa aaactggcac accagtcaat tttgatcaaa 4 620 C 

gtacttcagt gatcatcact aaatacccca cciiCutcaaa aatcztctcc tttccaattc 4680 ~- 

cttatttctt ca^-ctattta ttgagacggg gzczcqczqz gtcaccccag cccgggcgac 4740 Z 

agagtgagac tccgrcttaa aaaataaata aaiiaaaacaa aataaaiigac atcacttcgg 4800 u 

utcagagctc t:aaaa::ggag ggaggaagcc attctaaaaa ggactcccta catgacctgc 4860 □ 

aacttgaaaa aaaattaaaa gctccaaaaa aaaaacaana caggagctT;a cct:tgaacct 4920 □ 

□ 

ttgaattggg ccaaactgcg atgaccactg catcctggaa aatrta'attti caccagcact 4980 IZ 
■acaactccuc aacagcacca accaataaac tacggaccti- tgnacxaacc cagttgcctc 504Q C 
tttcaaaaca acttgtcaac ctgtccaatc accCwCaccc t'ttrraaaa acccczcctc 5100 □ 




-accrtctct cttcagaaca caactggcT:t ctacrccasa- zzgzczccca aattgcaact 515Q 

cc-aagacc" caaraaaaac accctgtccn rgc-aaa = =.= aaaaaaaaaa aaaaaaa 5217 



<210> 8

<211> 12020
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature

<222> (2,246)
<223> Androgen response element

<220>
<221> misc_feature

<222> (2,246)
<223> Androgen response element

<220>
<221> misc_feature

<222> (2,175)
<223> Progesterone responsive element

<220>

<221> misc_feature
<222> (2,627)

<223> Progesterone responsive element
<400> 8

aacacatttg ttcaaatatt atctacggct gcttccacat ttacaacagc ' agagtcgagt 60 
agctgcaaca gagaccaaat cctcctcaga gcataaaata tttatttttc ttttcaggag 120 
aagtataaca gctcctatac tatccagtcc ctcattattc caatc=catt atagttaata 180 
acttttttat gttaaactct ccctgttcaa attgttgtgt agtttctcac ttctgattga 240 
accctgtczg atacatctcc aacctctctg gc=gt:itctc agtcaatctt tgggctcccc 300 
tccctctacc catttcataa tgttagggtt ccrcaaagct cccctctgcr tttagatagg 360 ^ 
tttttgctct gccacccagg gtggagtgca gtggcacaat tctagctcac tgcagcctcc 420 
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aaccccLgac Lca<i<icaacc ctc = cacc-.c accc-cccaa agagctggga crar.agacg- 480 I 

gtaccaccat qcctagctaa ttttctagca gagacaanca argg^ct-ea atctagactc 540 I 

cattatggct ccaatg^gtc tataaaccac ccgaaa-.-a- acgcaaartr tatatggaca 60C 3 

tgcaaataca cacaag-atg gggagaaacc cccag"-"a tcrtr-ggca a.atagctca 550 ^ 

acagtraagg gcacaacaga ccctgcatac aac-gg^gg ^caaaactga crgcatctgg 720 ^ 

gacgctaarc tgcaatgccg actccaatgt tgac-gcaa- g^cgatcagc cccag^cctg 730 I 

tgaatatctc ctgarcccta ctttatttac tgtccc-ag- g-aagaaca^ grcaaccttg 840 □ 

atgrrattgc acaaaT:tatt atcca-.taca cargrag-.'.= -cecgcctgc iccggagqqc 900 2 

raccrttaat tgzcrtgcac agagcag^-" gacLC--".-" ctacagtaca taaac'.ctga 960 3 

gtgtggagct tacatgtaca gaaatctgcc acccaagccc acatttctgt ctgtaaattc 1020 G 

ccacagtaaa acaccctrac caagaaactg gatttgtcrg ccccaacctt tggtgtcttg 1080 C 

□ 

gctcctctgg catttggggg ccactttgca ta-agggccc cctcatggaa caccacccgc 1140 C 

ttcagtttga atcccagccc tgcctctttg acc-aggcta gcttacacgg tgccccagtL 1200 □ 

ttcttgtctg taaaataggg aaaactatag taccnacccc ttagagttgt tatgggaacc 1260 3 

aagagttaat atacatatat aaaccatitgg gacaggccc- qgccataact aagcagtatc 1320 - 

taagcactgg ctaccactgc catttctggc acrcrcccic cLCCCtagac ggttttctgt 1380 □ 

tgctctagtg caccctgctt tctctgaccg tgaaccc-ca ggtggggtct ttcctctgct 1440 □ 

cagcacactc ttgrccatg^ tgcttgc=ta ctaactcccc -.ctggcctcc gctcaaatgt 1500 3 

tatttcctca aaraagcctc ccctgatcca gaacrcc^ct cccttgtaaa agc-cccaga 1560 3 

gctttctcta arcctcctca taaccttcac catgacttat ttacagtgat tacttattta 1620 □ 

agg-cttcrc tgacagtcca tgaactccgg gcgagcagag attaugrgtt cr.tiggcrtcc 1680, 3 

\ -J 

tgctggacat cagcacagtg cctagtacac tacaggaa-g ccataaatgt ttgccaaatg 1740 □ 

_i 

cgtgaacgac tgaataagga acaaaacaaa tgaaaggctt ttacaaagtt ataaaaccca 1800 □ 
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atacaaatat taggcaacac aataatgaga atLCLcacaaa argrgatgat tgctaggtaa 1860 Z 

tccaa-grta aaLz.ggtagg aatcag-att gtg-^aagaa aaccctctct cctccaattt 1920 I 

gctataaaat aca-liagccc a::-:;ggaaccc y luili^l t:cz= accaaatgga aaccargcaa 1980 Z 

aaggaccaga aaatattact gcqcacaatt aaqgaawcac agacagatac tcscrgguta 2040 .1 

aaacaccccc cactttcacc cccaggaaat: aggcc'caja -gaaagaaac cac-gaacag 2100 Z 

ggaacccatt ta-^cacaga gaacai;aLtc atggccctaa ^iccaagacag gafitaaacac 2160 

tttcacgagt ccgcaaaacc aaccgrgcca atCwgagaca --^cattgcta ccccgggagt 2220 

-aatgratgt gacccaagcc atcaggaaga cgatair-ta ::-di:caatga acacrcaaac 22B0 

acragaaaaa tcacgatagt. tgccaccata arccaagaca gacacaataa agtgattctc 2340 C 

C 

ctgagaatgt cctcaaggac ttattgctLC ctcttcrc^g gaaaaaggaa tcctcgagga 2400 - 

C 

ggaagggttt acattccccc accttcccaa c::gt.cagacL ccggaccgct cctczcczgz 2460 □ 

C 

ctaatccgcc tttcatttcc cagacgccaa catctgatcc ccaggagaaa cagtgcggta 2520 C 

actgcctaac caggattgcg gaaaactgaa ccaggcactg gacctgggtc tggggagara 2590 CI 

gaggactgca gagatttcac aaacaccgcg qtqcczacza cactaeagga araagcgtac 2640 -2 

tatrcccata gugcatacnt tttttttrcc agaaaaaccc cCctggggta tgagcaccat 2700 

cttcagttct gg-tcaacrt ttcattgaga ttctatgcgg ataagaggat cttgcagcga 2760 ~ 

aaaaaagata aaaagcagrg ctttcatctg agtgtgagaa accatgtgtg cc-cagaggi:g 2820 

ggagaactga gtacLcagaa ggcagaaaaa acaactgaga gctgggtgaa tgatcccaca 2S80 3 

n 

ggccctaagc cccagatggt tgggga-tag cgatcca-aa ttcaactcct gccagcgttg 2940 G 

gcggcagcgc tcccacgtag ccagcccgag gctgtcaagc tctggactac attccccaga 3000 

aggt:i^:gctc cccgggctgt cgcaattrga attggggcgT: gtcT:agaaag agaagccata 306Q G 

gtcggcgagc aacgctggag catcccgctc tggtgccgc" gcagccggca gagatggttg 3120 u 
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agcrcatgci: cccgczgzzg czcczoczzc t:c = c--tc = - tczccatacg gc-gcgcccc 3180 I 

aaatcagg-c tgtgcadatg cattgccttc tct-gsgaa:: tczcgntgcg gcgcgggagg 3240 I 

gaaacataga atccccacca aagctfccTzt g?a=a£c-;c cc-ggggggg ccccaagatg 3300 Z 

gcggagccca gag-'cgg^g gg;:t:ggaggc gagg-zcagc -gtcctgggt gccaggagga 3360 Z 

cacccgagac cccaggcggt gagaaagggc gccag--cgg ggcacaccgc ctctccctgg 3420 ■ Z 

cctgctctitc tcttgtctaa gcaccccc-g atcctggc-c cagggaarcc cgaggg-aag 3430 Z 

gctaccLcaa acgcttgagt: zzcgg&czcz tcaacgggaz atggcccgac gctgtt^-cc 3540 " 

ccrtat::ccc cacgccttaa accagcctcc gagg-acaaa ggagcgattc cacactccgc 3600 I 

cctctgaaac tgatagcaat cgtaacaccc gcggccccac :ccrgctctc tcgctgcrgt 3660 I 

tacaatcgct tgttctgtgg gcggatgtcg taaaggagaa aagtgaaaag gaagagtgtt 3720 □ 

tggttatgag ctctatcctg ccttgcgagc aggagcacgg gggatctgag ttaggagcta 3790 □ 

gggtactggg gcggtgaggg ccgggagaat ggtgggaagg cattgaccag gtgatagatc 3840 □ 

□ 

aatcagccca aggctggcca agcgccccag tgaaccagca gcgggctgra aaaccggcag 3900 D 

C 

cctgcttatc ggttccaagt aaatcaccag tggtgacgga ccagtgactg aatagacaga 3960 O 

cacactcctt ccccagccgg gcggtCLTiag tcccaacaga acctgaagcc aggccagatc 4020 G 

n 

acagtcattt cttaccacag tcacctactt Lcccccct::^ afctttattt tctcttcctt 4080 □ 

ctccggtgcc cctcttccag ttatattcti tctgcccic- gaacactcag ttctcccacc 4140 C 

cccaggcaca catggtrtcc cacacrcaga tgaaaacac' gcctcttacc ttrttrtcac 4200 □ 

tacaagaicc cctuatccac aiiagaatctc aacgaaaagt tgaaccagaa c-gaaaatag 42 60 ^ 

tgctcatacc ccagagggat t,i:ttct.ggaa aaaaaaaaaa tatatgtgcg Tiatatatata 4320 

tatatatgtg tgtgtgtgta tstacacata cg^qngrgtg tgtgtatata taratatata 4 3 90 

natattccag gagctgtatt cccaacccag cagggLtatg ggagtcgggc cagagctgga 4440 ' G 

crtLtatagt t'gtct=t:ttc attccrgctg gtcaaggaga agaaagaact aagtacaggr 4 5QO 
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ccnactaagc tgtcactcct ggagcuacac acac-aagg? gacgcgggaa acagccraga 4 560 ' 3 

ggcctgc-ct c-gaggaggg dggtga-ggt t;:g-gggga= cacgagtcaa cCTigcccccc 4 620 

r.a-cagcatt ttcgcacta:: gcaggcacc:: cca=.iggc-a sgagtatgta ccrtgcaatt 4 680 I 

cc.^agr.r.caa ccaaaaaggc ugaaacaciic ci:caagaacg catc-cctgg tcaacagatc 47 40 J 

gci^^aacca gagtcccaac attgccaaca agacaa'agg ccagnc-gag aaccaccaat 4800 I 

ctccgctcgt gcattcat-c atttggaaaa ggcaa'atag ratagtacti: agaaacaagc 4860 

actctaggc' gggcttggcg actcacaccc -gtaatcccag cactttggga ggctgaggtg 4920 I 

ggcggaccac ccgaggrcag gagttcgaga c=a-==-ggc cagcguggtg aaaccctgtc 4 990 I 

tctactaaaa atacaaaaat cagcrgggcg tgg-ggcgca tgggtgaggg ggtgagggat 5040 3 

□ 

gagaatcgct ugaacctggg agatgggggt tgc^grggag gtggaggttg cagtgagccg 5100 □ 

□ 

agatcacacc actgcactcc agccrgggca acaagagcaa aactctgtct caaaaaaaaa 5160 G 

□ 

aaaaagagaa agagaaagaa agaaacaagg acrc-aatta cacaaagcaa gactrgaatc 5220 O 

□ 

ccagctcccc tttgtaca::a cataatcttc tccr-cgctt- aagtgtgtat gtgcatgcgt 5280 □ 

H 

gtgggtgtgt tttttttCtc ttgtttattc cgtcccagag tcacaagctt ccataatttt 5340 O 

J 

aactttt^gt attggattag ccattcctca ■ aggccataaa atgccagttg tggacagatt 5400 J 
gacctacaga aactagttat trctrciitct gggc^gaatc ttgrgatrrc cLaagtcaca 54 60 I 
agtgcct-ag agagctcaaa ccaaagccag gtgcctggcn aagcaggtat tcatccagrc 5520 -i 

a 

cttgtgctgc agagaacagc tgtaagcctt gcccrrgtaa tagaaccact . tttgtcttrg 5580 3 



agcaaaagca gaaaactttt gtcarcrttc ctzcaacttt ccttaaatgg acactgtttc 5640 
cttaaaagga caacagtgga aagaactaaa gcaaacc-ca gccagaagac acagggagti: 5700 
gcaggaacct acuggctgag acagtagaa- gaagaaccga tgagctctcc itctctgccc 57 60^ 
gcaggaaaat gcx-gtccagt: ggggcgTigca carcaacCgT: tcagcttcc:: gggaaagtag 5320 



3 
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ctgcggncac aggagcf-aar acaggtatcg ggaaggagsc agccaaagag ccggctcaqa 5880 Z 

gaggcaag-t cacctcctct caantcccuc ccat-scirr igmttaccc ctcacagtaa 5940 I 

tgatgatcca i.-CT:cattgt cccctcccag i:t--gga = zz caagaacacc taaggacrcc 6000 3 

actgatttgc -gatcrgaaa ctagtacrat gctc-aca-s zaa-cggaccc ccaai:gaata 6060 I- 

ttattaactt a-ttttcttt acctactigc taaccagaia cctctgrtg- gaagaaaggg 5120 Z 

caaaaaLgcg cg-tttttaa aaaaatct'g gaaataacai -agatttaia acgccagccc 6180 I- 

aaaaacagag cngaccatixit crctttcacd LyLLugc-g= caggagctcg agtacatirra 6240 I 

■J 

gcntgccggg acctggaaaa gggggaaccg g-cgcccaaac agatccagac cacgacaggg 6300 □ 
aaccagcagg tgctggugcg gaaactggac cTigcccgaLa ccaagtccat tcgagctttt '6360 

□ 

gctaagggct tcttagccgg caagtgtaga actagagaga ccacgr-grg ggccttgagg 6420 □ 

□ 

tcaacacggc agattcaact aaaaggaaaa ggtccc-ctc cctcatagaa gcggtggaca 5480 □ 

0 

aaatatagcc aaaaacttcc gtagctgaac tcatag-ccac tgggacacaa tttatgtcaa 6540 □ 

a 

agggatatct tccagaggca" gaaacaaaaa atgaaT:-caa aacgtcauag cagtgagtaa 6600 U 

□ 

gagctgaagc cagaaatctc atgggagcct: cagcactcag cacaggct'iiT: ttagggtctg 6S60 H 

□ 

gggtctaaat ac--acacag ggataggaga tgaggccaca ggcccccgag agctggagct 6720 □ 
gtarttgagg ttccacacaa agttgagatt cacaaagggc catctggtcc atagtataga 6780 
agctaataaa aatcctggcc accagctcag agacacgaca aagaagttca ccatcCcttt 6340 
agacacagat ggatacctac ctgcaaaaaa rcagaacccg agctggcatc atgrgtgggg 6900 
tctgaattcg taccaccaaa tgatatgagg atatunatgc gtttgattti; tgtatgtgtg 6950 
tgtaatactc tatggaaaat caacgtaaaa acctgtagcc rctcatcaaa agcaaaccca 7020 
agagcacccc ataggcar.at-. ccccctatag cccagggatc caggcctcca aqgaaaaagc 7080 
Lt-cacnggt gcgaactcat agccanaaat tacaaaccac acgaggaaat gaaccatgag 714 0 
agagaattgg aagacgcaat aaagaggrga agccagza^:: caetrcrttg gggacctgaa 7200 



□ 
n 
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tcccagggca aaggiiacata gagaacarcc tt-zzzzczzz cgccttggaa LCH'tccatt: 7260^ _ 

cagcccggga agcctC'Lua ccgagagagg cgga-igc= = c aa^'cccaaa agcagtgagt 7320 3 

aagtagctcc cgaactacaa ggatccggaa gg-ca--irg cauLdUdddd cggaattctg 7390 

ccccaragta aaagcaggcg ggattcaacn aaac-c--=a aacgaaaaac aatagaaaat l^iAQ Z 

aaata-gaag gaaaaatgga gggaagatcg aag-::c-gca ecctca-titir caaagcctga 7 500 -I 

gcgtgaggca gcttcctaag argtaa^iagc ct-ggccagc cacccccacg ccccatcgtc 7560 ~ 

ccgcccctgc agaggaaaac cacctccacg ttt-garcaa caatgcacga gtgatgatgt 7 620 Z 

gtccgtactc gaagacagca gacggcnttg agacgcaca. aggagtcaac cacttgggta 7 680 

agaaatctgg ccttatcaca aagctagaga gacccaaaat rcLCCttggg aaactiiaggg 7740 Z 

gtcagattga ygcCLcttgt tgtgrgattg aacctcagga ccataataga ccatgtgtgc 7800

ataatagttt ggiicctacct gacagctgag gaataccLga gggctgctcc aggttctgcc 7860

cgctttttaa cctgtgattt agatttggtt tgccttcgcg cattttttaa cagactaatt 7920

ggtttagggg actgtgctgc tttcctagta gtgacccc-g cggccagcat caggtatagg 7980

tcatctgctc tttttttgta aagacagggt ccrgccaLg- tgcccaggct gtacttaaac 8040

tcccggcctc aggcaccttc ccgcctcagc ctc^rtaagi: gcrgagatta ccggcgtgag 8100

ccaccgcacc cagccaacca gctcttaaga atgagai.caa ccaaagtagg aatgagctgg 8160

gaccccatat tcacatagtc -agacctatc ccgc-cacat aaggatgcct catgtctttg 8220

gctgttgtta tttccctaaa ccttcattaa actrgttctc zagttgataa ngaaaagttg 8280

gagggaaaga ccttatttct gaattarctg cc'cccacag gcgaggcctg ctcaccacga 8340

ctcctgtcat tccracatgg cccrgaacct cat-aatrci: agratttctc aacaggtcac 8400

t-rcctcctaa cccatctgct gctagagaaa ccaaaggaac cagccccatc aaggatagta 8460

aargtgcctt cccccgcaca tcacc-ggga aggatccac- tccaraacct gcagggcgag 8520
aaatcctacfl Atgcaggcc' ggcciiaciai cs-zazcaacc tacccaacac ccTicttcacc 8580

caggaactcg cccqgagact saaaggtggg cc-agacgaa atgaatgtgg agacacgagc 8640

cagagagagg cagg;:aaaag ccacaggaat ggaaaccgc- -ccacagccc atcacccgga 8700

gaggacctgc ccccg'ggc- acaaTiigggt acag::--tag caac-ag'cr acaactgaat 8760

artggaggtg gctggctggg acagtgagac tctgac-ccg =gic-gaagg cctggcctca 8820

ccacrtacta gctJTLCtgat ctatagcaag tacritaact: ----c----t: ttgagacggg 8880

iZczzqczcz gccacccdcg ctggagiigca arggrgcaat CTicggctcac rgccgacCTic 8940

cacttcccag gLtcaagcga -tctcctgcc tcagcctcct gagragctgg gactacaggc 9000

gcgtgccacc aggcctggcn aatttttgta ct-T:cagrag agacaggatt ccaccatgrc 9060

ggtcagggtg gtcttgaact cgtgaccrca ggtgarccac ccaccctggc cccccagagt 9120

gcrgggattt taggrgtgag ccaccgcacc tggccaccac ctaacttcca ggacttcctt 9180

cctcaratgt aaatgggact acttcacagt ttattgtcat aattaattca ggcc^cataa 9240

gcaagiicaag attttttacu cacagtgaga ctgagtatga gggaaggacc tgaactggca 9300

ctagataaag gcagtagacC aaaacaggta ctgggaatat ggaggggcag ggtctccctt 9360

gcaaagcatc aaacaagagg gaaataT:aT;g ctcgagaagg ggaarttcca gtcacaaaca 9420

cacagaggtc cctagcactg cctactctgg atcccraata aagagaaggg aaatcgaggg 9480

tgaaccgtgc taggaattga ggcaggcaga acgcaggcrc aaggaancag gcacaggtgc 9540

cctacagttg gaataggatc atcctaccaa acgaacatct gtgagatggg aaggaaaccg 9600

girggttagct cagtccr-at gcccagtctg tgcccagggc atgacatccu ttatcagctt 9660

aggatagata agggtaaccg gacggtgaaa actacccdaa gagttgatca gggagcragg 9720

gtggcctgga ggr.ggaaaga -gggtccagt ggrgagaact cacraataca rgaacgaggc 9780

gactctgaca grtgrgagga t-cagtggaa aaccactacg gxigcagtaaa agaaccctgg 9840

ittgacagaa gattcattca tccagccatc cagcagacac gta-rcag-g cccatgttgr 9900

cccgi^gcact: gggtitagggg acaggggacg aattggtcsa caagacaggt atagtctctr 9960

cttcggcttg gct^acatcc aacgggggag aaagaccit. agcaagtctt tatacaaacr 10020

actaaacgra aaatcagtct agtaangcat caacaaccz. can-^g-caaaa agctcaaaiza 10080

aggtcagcca cagtggcrca cacc"T:taar cccaggcaci -raggaggcc aaaaccggag 10140

gaccacctga ggccagttca agatcagcct gggcaaa = -:= g-gagacctc atctcctcta 10200

ccaaaaaaaa aaaaaaaaga gttcaaagca tatggaag-= rgragaatta agctgaaaaa 10260

ccaccttcac ttattcccct cnzcczcczq czctzczzzz t-catacgac cactatcaac 10320

agcttgtgtg cccct-tcca atiaatctticc tacgca-c-.a cacargratg taccargcac 10380

aaggaaccga aactccaaca cagtatctcg gagatctt^ic cattcagtac acatggartt 10440

accccttaat tgtattggcr gtgttgtatc ccgtagtaTia gatgtgcctt agtgta'tta 10500

tgtatttcct taacaacarc tggattgttt cczzztzgzz gcggctgttt aatcairaatt 10560

gtgataaatg ttataaagga gaagtaccag gacttacaac ggattttagt gggagtcttg 10620

gggggacctt attcagtctg agggttcagg gaagactCuC ctgaagaagt gagccgcacg 10680

ctgagccccg aaggatgatt agaaattaaa tggactagag gtaCi:gatag aatgcggagg 10740

ctgaggccgg gcgcggtggc tcacacctgt aatcccagca ccttgggagg ccgaggctgg 10800

t.ggaccaccc gaggtcagga gttcgagaac acccLggc:^a actcggtgaa agccctgrct 10860

ctactaaaaa tacaaaaatt agccaggtgt ggtggcgggc gcctgcaatc ctagctacrg 10920

gagaggctga ggcaggagaa tcgcctgaac cr.gggacaga gtctgcagtg agccaagatc 10980

atgccactgc actccagcct gggcgccacg agtaaaac-c tgtccgaaaa aatgaaaaaa 11040

aaaaagggcc ggacgtggrg gctcatgctt gcaatcccag caatttggga ggctgaggrg 11100

ggcagaicac crgaggtcag gagttcaaga ccagcctcac caacatggag aaaccccatc 11160

tctactaaaa aracaaaac- agccgggcac gtnggcgiigL gcctgtagtc ccagczqczc 11220

aggaggctga ggcaggagaa crgctcnaac cr.ggcagcca caggc-ct^g tgaqcrqaga 11280

CcaLaccaci: gcaccccagc cxigggcaaac aaga?cg=a= c.ccacc.cd aaaaaaaaaa 11340

aaaaaaaaag gaacatggag gctcagaaac aZT.czazzzt caacgagcag aatgcacaea 11400

ggtccgggcn :2gt-tgaga aacagcgaag gtza-csz-; c^agg^^-g-.a gaaagcaaga 11460

gcaataaaga tctcagctga gactggcgag g-aggz£g=t rccaaaccc" eagaggccac 11520

agcaaggatt ttagrgztita ttctaagagc aa-gggaaa= ggctaaaati: aaagcct'tt 11580

caaaatgggg gcaagagggg -tgcagcagi: cca----ca--- agc^cagg-g agaaatgagg 11640

gtagtttgga cr.aggtggEg gaartggagg g^acraqaii c^LCCgagag agatttggtt 11700

tgggaaagcc tgttgacugg agatgcatgc -tragg-ggg cattcaccag aaccctrrtg 11760

cctgattggg caigtcggga acactgcaag aaccaccztt ggcTittaaaa gtagaaatca 11820

atttgggacc cccgagtctg cattacratg ggtcatccag ggagatgtcg agcaggcagt 11880

tgggaataat acagctgrgc ctgcagctgt cagxiccacaa agaccagaaa aacrcrgttt 11940

cttcaaagct ttacaccaaa actcatttigg cagcaaaaca tT-cctgtac tagggagatt 12000

agattaatag ycagtgtatg 12020

<210> 9
<211> 869
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature
<222> (576)..(590)
<223> Androgen response element

<400> 9

aaaaagtcaa ccgrgagctg tatttggtgc r.gagccctga accacccgcc qgccagtggg 60

tggccactgg cagtgcccgt ggacgttcrc ggcgaaagac ggcc^gactg cacctgctct 120

ccctcagttc ciiagtacigT: gaggcac-cr ttzzccaczc tcccacaacc cccagctcaa 180

cacaactgcg gcgacngc-- iigtccg-- ccc5,-gagga ccctcacatt "caacaaggc 240

^cagtaaccc acacaccgac actcccaggg agccgzg*"? ccagcaac-g gcccaggeci; 300

--ccctcctg cgcccatctt gcccttaarc tc'gctgaca -cgatgggtt tagctacctt 360

g-cgtgtrtc cccagugt-- tangccacac agnc-cc;-- gagcaga-gt ncrgtacagg 420

ggaratcaca gc'rcctgci: gagaacnggg traag-ccag caggacgcc- gggcctagrg 480

gcactggagt tgccggtggc cttgagcccc aggcagtc-- ccgagagccc ttcccgatgg 540

atgcttgagt cactgagtgn gttcctccc- cttrccggtc aggcEctccc ggccgctgca 600

ctT:acaattg caccccancc ccctacccac icrca-cccca gagcagagtg cctctgctgt 660

aacctctaag gcacaagtgg gtcccaaagt ccttaaatat ggagggatgt ggggaagcag 720

tggcgaagat gcaggccaaa ggccagcagg caacacrggt ctccguugca ggtaaccucc 780

atctagagat tcttctagtt ctttccagat ttaccttcta aaaactaact ggtatggaaa 840

tattacagtc ctgtaattct ttcttctag 869

<210> 10
<211> 2172
<212> DNA
<213> Homo sapiens

<400> 10
ttgatgtccc caagtagrcc accttcattt aaccctttga aaccg-catca tctctgccaa 60

gtaagagtgg tggcctattt cagctgcttt gacaaaacga ctggc'cctg acctaacgtt 120

ctataaatga a-cg-gctgaa gcaaagtgcc cacggtggcg gcgaagaaga gaaagatgtg 180

ttttgttttg gacccTictgt ggtcccttcc aatgctgrgg gcctccaacc aggggaaggg 240

tcccrrttgc attgccaagt gccataacca ccagcaccac cctaccatgg ttctgcctcc 300

tggccaagca ggctggtttg caagaatgaa atgaatga't ctacagctag gactraaccr 360

cgaaatggaa agt-cttgcaa cccca-ttgc agga-ccg*c -gtgcacatg cctcrgtaga 420

gagcagcact cccagggacc trggaaacag ccc^cactr- sacg-gctrg c^ccccaaga 480

cacaT:c;ccaa aagg-grtgr aarggtgaaa acgir-'crt zzzzzazzgc cccttcccac 540

ttangtigaac aac-gr-.gt c-L^:t---c- a'-zzzzzza -aczgtaaag ctcaattgcg 600

aaadtgaata tcacgcaaar aaattatccq ac----t---. zaaaataacc acrgcatcr.r 660

tgaagT:T:cT:g cctggtgagt aggaccagcc zccazzzczz ra-aaggggg tgatgc-cgac 720

gctgctggtc agaggaccaa aggngaggca aggc-agact zcczqczccz gtiggrtiggiig 780

cccLcagtrc crgcagcccc tcctgttigga cagc-ccczc aaacgacrcc ttcttattac 840

cccat-agtc tgrntccatg cccctaataa acaca-accc aagaccgcaa ttcacaaaag 900

aaagaagtt- atrggatirta caatcccaca nggcuggggt ggcctcacaa Ticatggcaga 960

aagcaaggaa gagcaaatca catct^acat gganggcagc aggcagggag agagtttgtg 1020

cacagaaact cccatttttt aaac:catcag gtcttgrgag acccattcac tatcacaaaa 1080

acagcacagg aaagacccac ccccatgatt caattatc'c ctatcaggtc cctcccacaa 1140

cacatgggaa ttatgggagc tacaaganga gattrggcrg gggacacaga gccaaaccat 1200

atcactctgc cccrgcaccc cccaaatccc gtatcrtrac atttcaaaac caatcaggtc 1260

t-cccaacag tcccccaaaa ^cttaaciica ttrcagcacc aagucaaaag tc-acagtcc 1320

aaagLci:cac ccaacacaag gcaagcrgcc -ccacctatg agcctgTigaa accaagagca 1380

agttaactac rtcctagaca caacgggggt acaggcattg gg-gaataca gccac-cccaa 1440

atgggagaaa ttggccaaaa taaaggggct gcaggcccct taaaagtcca aaatccagca 1500

gggcagtcaa arcttaaagc accaaaatga cctcctctga cuccacctgt cata-ccagg 1560

tcacactgat gcaagaggtg ggtccccatg gncrggggca gctccacczc cgtgacctcg 1620

caaggtatag cctccttcct ggctgcatca tgggc^ggrg tcgagggrct gcagc::rt:-c 1680

caggcacatg gtgcaagcrg ttggtggatc Laccac^c-g gg.gtcrggag aatgcTrggcc 1740

CZC-ZCZZ2.C a-^ctccacza ggcacT:3ccc ca = - = ^gc=r cgrgcagggg ctccaacccc 1800

icacrtcccr tctgcac-cc: ccnagcagag g--r-cc=-:= agggcc::cac cc^gcagcaa 1860

acttc-gcct: gggcatcca:: gcgt^ticcat ica.izzzz-z -aaacctacgc agaagtcccc 1920

aaatcrcaat cctrgaccrc -gtccactgg C5g==tcair accacacgga agc^gccaag 1980

gcctaaggcc tgcaaccc^t gaagccacag crr=3gczr- atcttggccc c^r^cagcca 2040 

tggctggagt ggctaggacg cagggcaiica ac-rzcti=g ctgcacacag cacaaggacc 2100 

crgggcc^gg ccca^gaaac cactrrttct tcc.aggc-- ctgggcccgc caegggaggg 2160 

gcttctgtga aa 2172 



<210> 11
<211> 2385
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature
<222> (536)..(544)
<223> Pbx-1a regulatory fragment

<400> 11

gcccctgtaa tacqactcac tatagggcac gcg-ggtcga cggcccgggc tggtccgatt 60

aatgtttgac aattaagcat ttttatgtga ttcatctcaa taagagccca agcaaggtaa 120

tgtaaacaat gaaggcataa aaaggtggaa cgagacgtga acactacacc aagtgggcca 180

catggaccta aggaaaaaga ggagctgagt tgaagtcagg agctgattca gataaagcag 240

cttcaactct actacttctt actcttagat gggctccaga ccaccttgag ggacgggact 300 

atatttacct tacagatgca ctcctttctc ccactaatac agtcagttag gaacgtatag 360 

ctaagaatga gccaaccatt actactcgtg tzc-acggcta cagaaactca ataaatcaac 420 

tcactattct gggtcaataa atgaactcat tatactgttt cttcatgaag taaatgtgct 480 

tctaaaggaa tgtattaaaa ttacattt-t aaaaatctat gtatcttgca tatatttgat 540 



tgacacagaa aaggagtctia ctittcgttac ct"- cr"- -'ccacaaac C'cattcatt 600

ggcgtgatct gaaaggggat aatatttcag aaagcaaaad ca-tt-taac aacaaaaccc 660

rgtatticca g-gcgacc-ca aacacaaaac zazzz5.&=.=.z =zaacza~za zzzczczzct. 720

ac-ggcraaa c = i-gccagt tgr-ctagg-g caa-caa^aa taacccar-a a'cgccatttt 780

cataggaatt ggc = agcaat tgccaac-ac agc-3c::aa= agcccatar^ igccacagga 840

gcctgcgatt ttaaatacrg ttcaaatcat Cta-gc'-ga gagcca^aca Lagcacaaac 900

acacacatici: gnauccacag gucnaaagag gaaaaac-ag acaaaaagca ac::tgaaaag 960

gaacccc-ct aur^iLCaacg acctaccwtt gt:gc-g-atr gagaaaaagt aiiaaaaacca 1020

tttgggagga agaaaaatat tT:ct:aagcac caatcrca-a ggcaaaagcc ctattaccct 1080

ttatgatagg ctaggggaaa tcaataatga tt'aaaagca aacagcaaac rgctcacatc 1140

taccagtcta tactcagtat taattctacc rr.gr.agcgra tgctatagaa ggtaatgaaa 1200

tgctgatgct ttctggtttt ccacttcaat ccgctccagc tttatcccaa accttgtcag 1260

caacagaagt taccccccag caatcataca aacctaacaa cggcttccct ggcttaattt 1320

ga-ctgctcc aatcacaaat tccaatccac anacttc^i" ctgttacggt tgngtctgta 1380

cgcgtttaai: gtactttcca taaggctttt ttacaacaag cgcatgg^cr rncatccagc 1440

cctccaccac acaaacagaa ccactgaaga gcatttttg- rcatataaaa atgttctagt 1500

tactcatagt gtangtcc-c aattggctgc agacracagc gaacaaaagt attgctgcca 1560

tcattt-aca tttcccctgr tactgcgccc tatctgcLgt gcacccctgc ccccaaagca 1620

tttattacaa tatttgctgc trgctctcag c-Ctggactg agatgctegg gggtgggaaa 1680

acttaaaac- cgcctgactc aagtactctc caagctttga -cccagcacc Qzgcztt^&c 1740

ctgctcccgt ;:ggacctctg ccactttact gaccccataa aacaaacaga gc-agcaaac 1800

gcatcaaaca gccccctrca aaacgcagct cgaccgrgar: tcatggcaca aagcrggaa:: 1860

aaccc^gcca 9t.-=ccacaa cc---a.=g^' zz?.-.7r.-: r.-ar.r.T.zi'.r..^. cacaragtiat 1920

agcacctnaa aaacgaatca zz"cz=.zzz c^cz^zzzzz laczcacaac cactcccccc 1980

cacggcatgt aaTL-aacc^s ctgagcttaa aaaaaaatat raggaccacc cacacagtcc 2040

-gaagatgca azczzaqzza ---aggacac =aacazccaa jgrcg-gtcs gaacucaatc 2100

acgaczacat atgcattaac gcacgaac::g gcacgcc-^a zggracgcza actiaraggac 2160

ccgcgct-cr cg::acgctgg gcraraa^cc acgaaacT:^^ ^c^ccagazc cagccaarca 2220

c-tagctccc ca-aacaagr ciaactggct crggaaag=- raaacggccg caciiuyaac:^ 2280

^-.cacagarna g^^ r .= r.r.crac acatcaaccL acrtiECC.:- aaiicactcta cccc-aaagg 2340

ccagagaaaa a-cacagcL- ccc-gccgga ggggaaaaca agggc 2385



<210> 12
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> PCR primer

<220>
<221> misc_binding
<222> (1)..(23)
<223> PCR primer BL-M13F

<400> 12
gtaaaacgac ggccagtgaa ttg 23



<210> 13
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR primer

<220>
<221> misc_binding
<222> (1)..(24)
<223> PCR primer BL-M13R

<400> 13

acacaggaaa cagctatgac catg

<210> 14
<211> 7
<212> PRT
<213> Homo sapiens

<220>
<221> BINDING
<222>
<223> Consensus NAD(H) or NADP(H) binding site domain

<220>
<221> VARIANT
<222> (2)..(4)
<223> Xxx amino acids are not conserved

<220>
<221> VARIANT
<222> (6)
<223> Xxx amino acid is not conserved

<400> 14
Gly Xaa Xaa Xaa Gly Xaa Gly
  1               5



<210> 15
<211> 5
<212> PRT
<213> Homo sapiens

<220>
<221> DOMAIN
<222> (1)..(5)
<223> Consensus dehydrogenase/reductase catalytic domain

<220>
<221> VARIANT
<222> (2)..(4)
<223> Xxx amino acids are not conserved

<400> 15
Tyr Xaa Xaa Xaa Lys
  1               5

<210> 16
<211> 27
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR primer

<220>
<221> misc_binding
<222> (1)..(27)
<223> ARSDR1 RACE forward PCR primer

<400> 16
cgacagcatc tLcctga-ct cgggggc 27

<210> 17
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR primer

<220>
<221> misc_binding
<222> (1)..(25)
<223> ARSDR1 RACE reverse PCR primer

<400> 17
cagaaggagg agcaacagcg ggaac

<210> 18
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(22)
<223> ARSDR1 primer 6A4N1

<400> 18
ccaaagagct ccc'cagaga gg



<210> 19
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(21)
<223> ARSDR1 PCR primer 6A4N2

<400> 19
ctgggtgaag aggatgttgg c 21



<210> 20
<211> 15
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Androgen Response Element

<220>
<221> protein_bind
<222> (1)..(15)
<223> Consensus androgen response element

<220>
<221> misc_difference
<222> (7)..(9)

<223> Nucleotide sequence not conserved

<400> 20

crgwacannr-C gzz~~ 



<210> 21
<211> 9
<212> DNA
<213> Homo sapiens

<220>
<221> protein_bind
<222> (1)..(9)
<223> Interleukin response element binding site

<400> 21
ttcccagaa

<210> 22
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(25)
<223> ARSDR1 chromosome location primer 6A4?

<400> 22
ggggcatttc cttacattgt ccttg

<210> 23
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(25)
<223> ARSDR1 chromosome location primer c.-J.?.

<400> 23
cacLCcaaac aagtgatggg aacac

<210> 24
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(25)
<223> ARSDR1 PCR primer 6A4insitu1

<400> 24
tcttcattca gaaaiidttat cttag

<210> 25
<211> 27
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(27)
<223> ARSDR1 PCR Primer 6A4insitu2

<400> 25
gacagttcaa cataaattaa gtaaaac



SUBSTITUTE SHEET (RULE 26)

WO 00/65067

PCT/US00/10920



<210> 26
<211> 30
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(30)
<223> TMPRSS2 gene specific primer ■,'"£32 9-"!?^

<400> 26
tgagc^caaa gccaccttcc -gttatcaac



<210> 27
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(22)
<223> cDNA library adaptor sequence primer AP1

<400> 27
gtaatacgac tcactatagg gc 22

<210> 28
<211> 27
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(27)
<223> TMPRSS2 gene specific primer a"'E32&-151R

<400> 28
ccaccctaat accacT:cact acagggc



<210> 29
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Primer

<220>
<221> misc_binding
<222> (1)..(19)
<223> cDNA library adaptor sequence primer AP2

<400> 29
actatagggc acgcgtggt



<210> 30
<211> 18
<212> PRT
<213> Homo sapiens

<220>
<223> Description of Artificial Sequence: PCR Primer

<400> 30
Lys Val Ile Ser His Pro Asn Tyr Asp Ser Lys Thr Lys Asn Asn Asp
  1               5                  10                  15

Ile Cys



<210> 31
<211> 16
<212> PRT
<213> Homo sapiens

<400> 31
Lys Leu Gln Lys Pro Leu Thr Phe Asn Asp Leu Val Lys Pro Val Cys
  1               5                  10                  15



<210> 32
<211> 18
<212> PRT
<213> Homo sapiens

<400> 32
Cys Trp Ile Ser Gly Trp Gly Ala Thr Glu Glu Lys Gly Lys Thr Ser
  1               5                  10                  15

Glu Val



<210> 33
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(23)
<223> PART-1 RACE primer 14D7-195L

<400> 33
gtgacggtct tggacagtaa ggg 23



<210> 34
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(24)
<223> PART-1 RACE primer 14D7-S5L

<400> 34
agagtattgt -ggcttcgrc tgcc



<210> 35
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(24)
<223> PART-1 PCR primer 14D7RC3

<400> 35
ctttcccctc cgacaaggaa gctg



<210> 36
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(26)
<223> PART-1 PCR primer 14D7RC4

<400> 36
ctcatctgtg ttgttccagt gcagcc



<210> 37
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(23)
<223> PART-1 PCR primer 14D7mapR

<400> 37
igctttgu^a agatgaggca cgc

<210> 38
<211> 26
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(26)
<223> PART-1 PCR primer 14D7mapF

<400> 38
cactccaggt gtcatggaca aagagc

<210> 39
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(24)
<223> 8C3 Primer 8C3mapR

<400> 39
tggcttcctc cctccatttt agag
<210> 40
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence:

<220>
<221> misc_binding
<222> (1)..(24)
<223> 8C3 Primer 8C3mapF

<400> 40
ggtgtcaaaa aactggcaca ccag



<210> 41
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(22)
<223> 8C3 RACE PCR primer 170L

<400> 41
ccggagtgac acagcgagac cc



<210> 42
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: PCR Primer

<220>
<221> misc_binding
<222> (1)..(24)
<223> 8C3 RACE PCR primer 43L

<400> 42
c-gatgtgcc accttc-::3a cacc
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